Robotics
PyTorch
Cosmos
xperience10m_task_baseline_suite
embodied-ai
multimodal
xperience-10m
baseline
evaluation
qwen3-omni
Add Qwen3-Omni held-out error analysis
- ARTIFACT_GUIDE.md +2 -0
- PROJECT_STATUS.md +1 -1
- data/artifact_index.json +46 -13
- data/mirror_parity.json +879 -79
- data/omni_finetune_verified_result.json +22 -1
- data/project_status.json +4 -2
- data/publication_audit.json +9 -9
- data/scope_claims_audit.json +1 -1
- data/task_surface_integrity.json +145 -145
- data/website_integrity.json +5 -5
- docs/data/artifact_index.json +46 -13
- docs/data/mirror_parity.json +366 -62
- docs/data/omni_finetune_verified_result.json +22 -1
- docs/data/project_status.json +4 -2
- docs/data/publication_audit.json +9 -9
- docs/data/scope_claims_audit.json +1 -1
- docs/data/task_surface_integrity.json +145 -145
- docs/data/website_integrity.json +5 -5
- metrics/artifact_index.json +46 -13
- metrics/mirror_parity.json +366 -62
- metrics/omni_finetune_verified_result.json +22 -1
- metrics/project_status.json +4 -2
- metrics/publication_audit.json +9 -9
- metrics/scope_claims_audit.json +1 -1
- metrics/task_surface_integrity.json +145 -145
- metrics/website_integrity.json +5 -5
- results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/PUBLIC_RESULT_SUMMARY.md +18 -0
- results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md +78 -0
- results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv +9 -0
- results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv +15 -0
- results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json +667 -0
- results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv +2 -0
- results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv +11 -0
- results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv +3 -0
- scripts/build_artifact_index.py +24 -0
- scripts/omni/analyze_qwen3_omni_errors.py +370 -0
- scripts/validate_mirror_parity.py +11 -0
ARTIFACT_GUIDE.md
CHANGED
|
@@ -110,12 +110,14 @@ research project.
|
|
| 110 |
| [`results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md`](results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md) | Documents the public multi-episode access path, selected 128-episode pilot plan, and data requirements. |
|
| 111 |
| [`docs/data/omni_finetune_verified_result.json`](docs/data/omni_finetune_verified_result.json) | Compact verified summary for the first selected-episode Qwen3-Omni diagnostic pilot, including split counts, held-out metrics, and the quality-target caveat. |
|
| 112 |
| [`results/omni_finetune/verified_public/`](results/omni_finetune/verified_public/) | Public-safe verified held-out result packages. These include metrics, predictions, reports, manifests, training metadata, validation summaries, and audit files, but not raw data or weights. |
|
|
|
|
| 113 |
| [`scripts/omni/discover_xperience10m_sources.py`](scripts/omni/discover_xperience10m_sources.py) | Discovery gate for valid multi-episode Xperience-10M sources. |
|
| 114 |
| [`scripts/omni/train_qwen3_omni_lora.py`](scripts/omni/train_qwen3_omni_lora.py) | Training entrypoint for the Qwen3-Omni LoRA pilot after the data gate passes. |
|
| 115 |
| [`scripts/omni/run_128_fullsplit_parallel_export_8gpu.sh`](scripts/omni/run_128_fullsplit_parallel_export_8gpu.sh) | Full 96/16/16 launcher with parallel export, 8-process LoRA training, validation-sample monitoring, held-out test evaluation, and quality-target reporting. |
|
| 116 |
| [`scripts/omni/merge_qwen3_omni_eval_shards.py`](scripts/omni/merge_qwen3_omni_eval_shards.py) | Recomputes held-out metrics from deterministic Qwen eval shards and checks missing or duplicate prediction ids. |
|
| 117 |
| [`scripts/omni/package_verified_omni_result.py`](scripts/omni/package_verified_omni_result.py) | Creates a contract-driven public-safe package from validated held-out fine-tuning outputs without raw data, base weights, adapter/checkpoint weights, full checkpoints, or large archives. |
|
| 118 |
| [`scripts/omni/audit_verified_omni_package.py`](scripts/omni/audit_verified_omni_package.py) | Audits a verified package before README, website, or Hugging Face updates by checking validation status, required files, primary metrics, held-out evidence, and forbidden file types. |
|
|
|
|
| 119 |
| [`scripts/omni/watch_verified_omni_package.py`](scripts/omni/watch_verified_omni_package.py) | Waits for a passing held-out eval validation and then runs the verified public-safe packager automatically. |
|
| 120 |
| [`OMNI_MODEL_EXTENSION_CONTRACT.md`](OMNI_MODEL_EXTENSION_CONTRACT.md) | Human-readable contract for adding new model families while preserving the same episode split, held-out evaluation, packaging gate, and public-safety boundary. |
|
| 121 |
| [`configs/omni_backbones/`](configs/omni_backbones/) | Backbone registry for implemented Qwen3-Omni LoRA plus planned Cosmos-style world-model and VLA/policy branches. |
|
|
|
|
| 110 |
| [`results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md`](results/omni_finetune/MULTI_EPISODE_ACCESS_STATUS.md) | Documents the public multi-episode access path, selected 128-episode pilot plan, and data requirements. |
|
| 111 |
| [`docs/data/omni_finetune_verified_result.json`](docs/data/omni_finetune_verified_result.json) | Compact verified summary for the first selected-episode Qwen3-Omni diagnostic pilot, including split counts, held-out metrics, and the quality-target caveat. |
|
| 112 |
| [`results/omni_finetune/verified_public/`](results/omni_finetune/verified_public/) | Public-safe verified held-out result packages. These include metrics, predictions, reports, manifests, training metadata, validation summaries, and audit files, but not raw data or weights. |
|
| 113 |
+
| [`results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md`](results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md) | Derived held-out error analysis by episode, action family, train-seen status, required-modality state, and object category for the validation-aware Qwen3-Omni diagnostic pilot. |
|
| 114 |
| [`scripts/omni/discover_xperience10m_sources.py`](scripts/omni/discover_xperience10m_sources.py) | Discovery gate for valid multi-episode Xperience-10M sources. |
|
| 115 |
| [`scripts/omni/train_qwen3_omni_lora.py`](scripts/omni/train_qwen3_omni_lora.py) | Training entrypoint for the Qwen3-Omni LoRA pilot after the data gate passes. |
|
| 116 |
| [`scripts/omni/run_128_fullsplit_parallel_export_8gpu.sh`](scripts/omni/run_128_fullsplit_parallel_export_8gpu.sh) | Full 96/16/16 launcher with parallel export, 8-process LoRA training, validation-sample monitoring, held-out test evaluation, and quality-target reporting. |
|
| 117 |
| [`scripts/omni/merge_qwen3_omni_eval_shards.py`](scripts/omni/merge_qwen3_omni_eval_shards.py) | Recomputes held-out metrics from deterministic Qwen eval shards and checks missing or duplicate prediction ids. |
|
| 118 |
| [`scripts/omni/package_verified_omni_result.py`](scripts/omni/package_verified_omni_result.py) | Creates a contract-driven public-safe package from validated held-out fine-tuning outputs without raw data, base weights, adapter/checkpoint weights, full checkpoints, or large archives. |
|
| 119 |
| [`scripts/omni/audit_verified_omni_package.py`](scripts/omni/audit_verified_omni_package.py) | Audits a verified package before README, website, or Hugging Face updates by checking validation status, required files, primary metrics, held-out evidence, and forbidden file types. |
|
| 120 |
+
| [`scripts/omni/analyze_qwen3_omni_errors.py`](scripts/omni/analyze_qwen3_omni_errors.py) | Computes public-safe held-out error-analysis tables from the verified Qwen3-Omni prediction package. |
|
| 121 |
| [`scripts/omni/watch_verified_omni_package.py`](scripts/omni/watch_verified_omni_package.py) | Waits for a passing held-out eval validation and then runs the verified public-safe packager automatically. |
|
| 122 |
| [`OMNI_MODEL_EXTENSION_CONTRACT.md`](OMNI_MODEL_EXTENSION_CONTRACT.md) | Human-readable contract for adding new model families while preserving the same episode split, held-out evaluation, packaging gate, and public-safety boundary. |
|
| 123 |
| [`configs/omni_backbones/`](configs/omni_backbones/) | Backbone registry for implemented Qwen3-Omni LoRA plus planned Cosmos-style world-model and VLA/policy branches. |
|
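The guide entry for `scripts/omni/merge_qwen3_omni_eval_shards.py` says it checks for missing or duplicate prediction ids across eval shards. A minimal sketch of that kind of check follows. The function name `check_prediction_ids`, the list-of-lists shard layout, and the sample ids are hypothetical and are not taken from the script itself.

```python
from collections import Counter


def check_prediction_ids(expected_ids, shards):
    """Return (missing, duplicates) for prediction ids across eval shards.

    `shards` is assumed to be a list of lists of prediction ids, one inner
    list per shard (hypothetical layout, chosen for illustration).
    """
    seen = Counter(pid for shard in shards for pid in shard)
    # Ids that were expected but never appear in any shard.
    missing = sorted(set(expected_ids) - set(seen))
    # Ids that appear more than once across all shards.
    duplicates = sorted(pid for pid, count in seen.items() if count > 1)
    return missing, duplicates


# Illustrative ids only: "w2" is absent and "w1" appears in two shards.
shards = [["w0", "w1"], ["w1", "w3"]]
missing, duplicates = check_prediction_ids(["w0", "w1", "w2", "w3"], shards)
print(missing, duplicates)
```

A merge step would presumably refuse to recompute metrics while either list is non-empty, though the actual gating behavior is not shown in this diff.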
PROJECT_STATUS.md
CHANGED
|
@@ -30,7 +30,7 @@ scale-up readiness; it is not presented as final full-dataset model quality.
|
|
| 30 |
| Public dashboard and Hub pages | Verified | GitHub Pages, HF Space, artifact dataset, baseline model repo, Qwen3-Omni LoRA repo | Readers can move between the website, code, derived artifacts, baseline weights, and Qwen3-Omni pilot status without needing local infrastructure details. |
|
| 31 |
| Public package policy | Verified | `DATA_NOTICE.md`, `REPRODUCIBILITY.md` | Raw Xperience-10M data, private gated files, large archives, credentials, and full Qwen weights are not redistributed. |
|
| 32 |
| Reproducibility | Verified for the public sample | `REPRODUCIBILITY.md`, `docs/data/reproducibility_matrix.json`, `notes/reproducibility_audit.md` | The public sample workflow has explicit commands, expected outputs, and exact-match reproduction evidence. |
|
| 33 |
-
| Qwen3-Omni fine-tuning | Verified validation-aware diagnostic held-out pilot; quality target not met | `docs/data/omni_finetune_verified_result.json`, `results/omni_finetune/verified_public/`, `scripts/omni/package_verified_omni_result.py`, `scripts/omni/audit_verified_omni_package.py` | The selected 96/16/16 episode split produced a validation-aware public-safe held-out package with 3,808 exported windows, 512 validation windows,
|
| 34 |
| Raw Xperience-10M redistribution | Not included | `DATA_NOTICE.md`, `docs/data/publication_audit.json` | Raw MP4, HDF5, RRD files, private gated data, and full Qwen weights are intentionally excluded. |
|
| 35 |
|
| 36 |
## Fast Research Route
|
|
|
|
| 30 |
| Public dashboard and Hub pages | Verified | GitHub Pages, HF Space, artifact dataset, baseline model repo, Qwen3-Omni LoRA repo | Readers can move between the website, code, derived artifacts, baseline weights, and Qwen3-Omni pilot status without needing local infrastructure details. |
|
| 31 |
| Public package policy | Verified | `DATA_NOTICE.md`, `REPRODUCIBILITY.md` | Raw Xperience-10M data, private gated files, large archives, credentials, and full Qwen weights are not redistributed. |
|
| 32 |
| Reproducibility | Verified for the public sample | `REPRODUCIBILITY.md`, `docs/data/reproducibility_matrix.json`, `notes/reproducibility_audit.md` | The public sample workflow has explicit commands, expected outputs, and exact-match reproduction evidence. |
|
| 33 |
+
| Qwen3-Omni fine-tuning | Verified validation-aware diagnostic held-out pilot; quality target not met | `docs/data/omni_finetune_verified_result.json`, `results/omni_finetune/verified_public/`, `results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/`, `scripts/omni/package_verified_omni_result.py`, `scripts/omni/audit_verified_omni_package.py`, `scripts/omni/analyze_qwen3_omni_errors.py` | The selected 96/16/16 episode split produced a validation-aware public-safe held-out package with 3,808 exported windows, 512 validation windows, 448 test predictions, and derived error-analysis tables by episode, action family, train-seen status, required-modality state, and object category. JSON validity is 87.50%, below the 98% target, so the result is a diagnostic baseline and the next pass should focus on structured-output improvements. |
|
| 34 |
| Raw Xperience-10M redistribution | Not included | `DATA_NOTICE.md`, `docs/data/publication_audit.json` | Raw MP4, HDF5, RRD files, private gated data, and full Qwen weights are intentionally excluded. |
|
| 35 |
|
| 36 |
## Fast Research Route
|
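The status row reports JSON validity of 87.50% against a 98% target. The sketch below shows one way such a rate could be computed, assuming validity means "the raw output parses as a JSON object". That definition, the function name `json_validity`, and the sample outputs are assumptions for illustration, not the project's evaluation code.

```python
import json


def json_validity(raw_outputs):
    """Fraction of raw model outputs that parse as a JSON object."""
    valid = 0
    for text in raw_outputs:
        try:
            # Count only outputs whose top-level value is an object.
            if isinstance(json.loads(text), dict):
                valid += 1
        except (json.JSONDecodeError, TypeError):
            pass  # Unparseable or non-string output counts as invalid.
    return valid / len(raw_outputs) if raw_outputs else 0.0


TARGET = 0.98  # Quality target quoted in the status table.

# Illustrative outputs: seven well-formed objects and one truncated object.
outputs = ['{"action": "pick"}'] * 7 + ['{"action": "pick"']
rate = json_validity(outputs)
print(f"{rate:.2%} meets target: {rate >= TARGET}")
```

If the reported rate is taken over the 448 test predictions, 87.50% would correspond to 392 parseable outputs. The diff does not state the denominator, so that figure is an inference.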
data/artifact_index.json
CHANGED
|
@@ -1,12 +1,12 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Artifact Index",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"status": "pass",
|
| 5 |
-
"artifact_count":
|
| 6 |
"missing": [],
|
| 7 |
"by_kind": {
|
| 8 |
"project_path": 14,
|
| 9 |
-
"scaleup_contract":
|
| 10 |
"project_scope": 1,
|
| 11 |
"source_alignment": 5,
|
| 12 |
"publication_workflow": 3,
|
|
@@ -28,7 +28,7 @@
|
|
| 28 |
"onboarding_doc": 1,
|
| 29 |
"generated_figure": 3,
|
| 30 |
"generated_figure_assets": 1,
|
| 31 |
-
"scaleup_status":
|
| 32 |
"citation": 1,
|
| 33 |
"license": 1
|
| 34 |
},
|
|
@@ -63,8 +63,8 @@
|
|
| 63 |
"surface": "repo_hf",
|
| 64 |
"shows": "Gives a compact current-state table for first-pass readers.",
|
| 65 |
"exists": true,
|
| 66 |
-
"bytes":
|
| 67 |
-
"sha256": "
|
| 68 |
},
|
| 69 |
{
|
| 70 |
"id": "project_status_json",
|
|
@@ -74,8 +74,8 @@
|
|
| 74 |
"surface": "website_hf",
|
| 75 |
"shows": "Machine-readable copy of the current project status for website and HF mirrors.",
|
| 76 |
"exists": true,
|
| 77 |
-
"bytes":
|
| 78 |
-
"sha256": "
|
| 79 |
},
|
| 80 |
{
|
| 81 |
"id": "research_roadmap",
|
|
@@ -187,6 +187,17 @@
|
|
| 187 |
"bytes": 6519,
|
| 188 |
"sha256": "a3773fc681e298325e2be80556d6be6e7e30b90ba22ee24b66633f07ff9c4ea4"
|
| 189 |
},
|
| 190 |
{
|
| 191 |
"id": "additional_development_directions",
|
| 192 |
"title": "Additional development directions",
|
|
@@ -250,8 +261,8 @@
|
|
| 250 |
"surface": "repo_hf",
|
| 251 |
"shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
|
| 252 |
"exists": true,
|
| 253 |
-
"bytes":
|
| 254 |
-
"sha256": "
|
| 255 |
},
|
| 256 |
{
|
| 257 |
"id": "official_dataset_card_alignment",
|
|
@@ -695,8 +706,8 @@
|
|
| 695 |
"surface": "repo_hf",
|
| 696 |
"shows": "Generates the selective artifact catalog from local files.",
|
| 697 |
"exists": true,
|
| 698 |
-
"bytes":
|
| 699 |
-
"sha256": "
|
| 700 |
},
|
| 701 |
{
|
| 702 |
"id": "publication_audit",
|
|
@@ -731,7 +742,7 @@
|
|
| 731 |
"volatile": true,
|
| 732 |
"shows": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, website HTML, and validator files.",
|
| 733 |
"exists": true,
|
| 734 |
-
"bytes":
|
| 735 |
"hash_policy": "existence_and_size_only"
|
| 736 |
},
|
| 737 |
{
|
|
@@ -933,6 +944,28 @@
|
|
| 933 |
"bytes": 3076,
|
| 934 |
"sha256": "23b87581cfc1d95b0af118a0dbb4e601f42fc6bad608759490e13a9a1ef73205"
|
| 935 |
},
|
| 936 |
{
|
| 937 |
"id": "citation",
|
| 938 |
"title": "Citation metadata",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Artifact Index",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:53:45+00:00",
|
| 4 |
"status": "pass",
|
| 5 |
+
"artifact_count": 86,
|
| 6 |
"missing": [],
|
| 7 |
"by_kind": {
|
| 8 |
"project_path": 14,
|
| 9 |
+
"scaleup_contract": 7,
|
| 10 |
"project_scope": 1,
|
| 11 |
"source_alignment": 5,
|
| 12 |
"publication_workflow": 3,
|
|
|
|
| 28 |
"onboarding_doc": 1,
|
| 29 |
"generated_figure": 3,
|
| 30 |
"generated_figure_assets": 1,
|
| 31 |
+
"scaleup_status": 4,
|
| 32 |
"citation": 1,
|
| 33 |
"license": 1
|
| 34 |
},
|
|
|
|
| 63 |
"surface": "repo_hf",
|
| 64 |
"shows": "Gives a compact current-state table for first-pass readers.",
|
| 65 |
"exists": true,
|
| 66 |
+
"bytes": 8805,
|
| 67 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 68 |
},
|
| 69 |
{
|
| 70 |
"id": "project_status_json",
|
|
|
|
| 74 |
"surface": "website_hf",
|
| 75 |
"shows": "Machine-readable copy of the current project status for website and HF mirrors.",
|
| 76 |
"exists": true,
|
| 77 |
+
"bytes": 11274,
|
| 78 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 79 |
},
|
| 80 |
{
|
| 81 |
"id": "research_roadmap",
|
|
|
|
| 187 |
"bytes": 6519,
|
| 188 |
"sha256": "a3773fc681e298325e2be80556d6be6e7e30b90ba22ee24b66633f07ff9c4ea4"
|
| 189 |
},
|
| 190 |
+
{
|
| 191 |
+
"id": "qwen3_omni_error_analysis_script",
|
| 192 |
+
"title": "Qwen3-Omni held-out error-analysis script",
|
| 193 |
+
"path": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 194 |
+
"kind": "scaleup_contract",
|
| 195 |
+
"surface": "repo_hf",
|
| 196 |
+
"shows": "Computes public-safe held-out error-analysis tables by episode, action family, train-seen status, required-modality state, and object category.",
|
| 197 |
+
"exists": true,
|
| 198 |
+
"bytes": 15676,
|
| 199 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 200 |
+
},
|
| 201 |
{
|
| 202 |
"id": "additional_development_directions",
|
| 203 |
"title": "Additional development directions",
|
|
|
|
| 261 |
"surface": "repo_hf",
|
| 262 |
"shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
|
| 263 |
"exists": true,
|
| 264 |
+
"bytes": 16318,
|
| 265 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 266 |
},
|
| 267 |
{
|
| 268 |
"id": "official_dataset_card_alignment",
|
|
|
|
| 706 |
"surface": "repo_hf",
|
| 707 |
"shows": "Generates the selective artifact catalog from local files.",
|
| 708 |
"exists": true,
|
| 709 |
+
"bytes": 32191,
|
| 710 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 711 |
},
|
| 712 |
{
|
| 713 |
"id": "publication_audit",
|
|
|
|
| 742 |
"volatile": true,
|
| 743 |
"shows": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, website HTML, and validator files.",
|
| 744 |
"exists": true,
|
| 745 |
+
"bytes": 126335,
|
| 746 |
"hash_policy": "existence_and_size_only"
|
| 747 |
},
|
| 748 |
{
|
|
|
|
| 944 |
"bytes": 3076,
|
| 945 |
"sha256": "23b87581cfc1d95b0af118a0dbb4e601f42fc6bad608759490e13a9a1ef73205"
|
| 946 |
},
|
| 947 |
+
{
|
| 948 |
+
"id": "qwen3_omni_error_analysis_report",
|
| 949 |
+
"title": "Qwen3-Omni held-out error-analysis report",
|
| 950 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 951 |
+
"kind": "scaleup_status",
|
| 952 |
+
"surface": "repo_hf",
|
| 953 |
+
"shows": "Summarizes validation-aware Qwen3-Omni held-out failures by episode, action family, train-seen status, required-modality state, and object category.",
|
| 954 |
+
"exists": true,
|
| 955 |
+
"bytes": 3331,
|
| 956 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 957 |
+
},
|
| 958 |
+
{
|
| 959 |
+
"id": "qwen3_omni_error_analysis_json",
|
| 960 |
+
"title": "Qwen3-Omni held-out error-analysis JSON",
|
| 961 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 962 |
+
"kind": "scaleup_status",
|
| 963 |
+
"surface": "repo_hf",
|
| 964 |
+
"shows": "Machine-readable Qwen3-Omni held-out error analysis with grouped metrics and sanitized failure examples.",
|
| 965 |
+
"exists": true,
|
| 966 |
+
"bytes": 25202,
|
| 967 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 968 |
+
},
|
| 969 |
{
|
| 970 |
"id": "citation",
|
| 971 |
"title": "Citation metadata",
|
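Each index entry above records `exists`, `bytes`, and either a `sha256` digest or, for volatile files, `"hash_policy": "existence_and_size_only"`. A minimal sketch of how such an entry could be derived from a local file follows. The helper `index_entry` and its parameters are hypothetical; the real logic lives in `scripts/build_artifact_index.py`, which is not shown in this diff.

```python
import hashlib
import tempfile
from pathlib import Path


def index_entry(artifact_id, path, volatile=False):
    """Build one artifact-index entry from a local file (illustrative only)."""
    p = Path(path)
    entry = {"id": artifact_id, "path": str(path), "exists": p.is_file()}
    if not entry["exists"]:
        return entry
    data = p.read_bytes()
    entry["bytes"] = len(data)
    if volatile:
        # Volatile files change on every run, so only existence and size are kept.
        entry["hash_policy"] = "existence_and_size_only"
    else:
        entry["sha256"] = hashlib.sha256(data).hexdigest()
    return entry


# Demonstrate on a throwaway file rather than a real repository artifact.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.md"
    sample.write_bytes(b"hello\n")
    print(index_entry("sample_doc", sample))
    print(index_entry("sample_volatile", sample, volatile=True))
```

Skipping the digest for volatile files avoids a circular dependency in which a regenerated audit file would invalidate its own recorded hash.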
data/mirror_parity.json
CHANGED
|
@@ -1,16 +1,20 @@
|
|
| 1 |
{
|
| 2 |
-
"status": "
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"hf_root": "hf_publish",
|
| 5 |
"summary": {
|
| 6 |
-
"group_count":
|
| 7 |
-
"failure_count":
|
| 8 |
-
"failures_by_surface": {
|
| 9 |
},
|
| 10 |
"checks": [
|
| 11 |
{
|
| 12 |
"name": "repo_hf_space_artifact_model_data_parity",
|
| 13 |
-
"status": "
|
| 14 |
},
|
| 15 |
{
|
| 16 |
"name": "repo_hf_visual_asset_parity",
|
|
@@ -18,7 +22,7 @@
|
|
| 18 |
},
|
| 19 |
{
|
| 20 |
"name": "repo_hf_validator_script_parity",
|
| 21 |
-
"status": "
|
| 22 |
},
|
| 23 |
{
|
| 24 |
"name": "repo_hf_website_html_parity",
|
|
@@ -26,7 +30,7 @@
|
|
| 26 |
},
|
| 27 |
{
|
| 28 |
"name": "repo_hf_diagnostic_result_parity",
|
| 29 |
-
"status": "
|
| 30 |
},
|
| 31 |
{
|
| 32 |
"name": "repo_hf_quality_doc_parity",
|
|
@@ -98,34 +102,56 @@
|
|
| 98 |
},
|
| 99 |
{
|
| 100 |
"name": "data/artifact_index.json",
|
| 101 |
-
"status": "
|
| 102 |
"local": {
|
| 103 |
"path": "repo:docs/data/artifact_index.json",
|
| 104 |
"exists": true,
|
| 105 |
-
"bytes":
|
| 106 |
-
"sha256": "
|
| 107 |
},
|
| 108 |
"mirrors": {
|
| 109 |
"hf_space": {
|
| 110 |
"path": "hf_space:data/artifact_index.json",
|
| 111 |
"exists": true,
|
| 112 |
-
"bytes":
|
| 113 |
-
"sha256": "
|
| 114 |
},
|
| 115 |
"hf_artifacts": {
|
| 116 |
"path": "hf_artifacts:docs/data/artifact_index.json",
|
| 117 |
"exists": true,
|
| 118 |
-
"bytes":
|
| 119 |
-
"sha256": "
|
| 120 |
},
|
| 121 |
"hf_model": {
|
| 122 |
"path": "hf_model:metrics/artifact_index.json",
|
| 123 |
"exists": true,
|
| 124 |
-
"bytes":
|
| 125 |
-
"sha256": "
|
| 126 |
}
|
| 127 |
},
|
| 128 |
-
"failures": [
|
| 129 |
},
|
| 130 |
{
|
| 131 |
"name": "data/brand_assets.json",
|
|
@@ -350,27 +376,27 @@
|
|
| 350 |
"local": {
|
| 351 |
"path": "repo:docs/data/omni_finetune_verified_result.json",
|
| 352 |
"exists": true,
|
| 353 |
-
"bytes":
|
| 354 |
-
"sha256": "
|
| 355 |
},
|
| 356 |
"mirrors": {
|
| 357 |
"hf_space": {
|
| 358 |
"path": "hf_space:data/omni_finetune_verified_result.json",
|
| 359 |
"exists": true,
|
| 360 |
-
"bytes":
|
| 361 |
-
"sha256": "
|
| 362 |
},
|
| 363 |
"hf_artifacts": {
|
| 364 |
"path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
|
| 365 |
"exists": true,
|
| 366 |
-
"bytes":
|
| 367 |
-
"sha256": "
|
| 368 |
},
|
| 369 |
"hf_model": {
|
| 370 |
"path": "hf_model:metrics/omni_finetune_verified_result.json",
|
| 371 |
"exists": true,
|
| 372 |
-
"bytes":
|
| 373 |
-
"sha256": "
|
| 374 |
}
|
| 375 |
},
|
| 376 |
"failures": []
|
|
@@ -474,61 +500,83 @@
|
|
| 474 |
"local": {
|
| 475 |
"path": "repo:docs/data/project_status.json",
|
| 476 |
"exists": true,
|
| 477 |
-
"bytes":
|
| 478 |
-
"sha256": "
|
| 479 |
},
|
| 480 |
"mirrors": {
|
| 481 |
"hf_space": {
|
| 482 |
"path": "hf_space:data/project_status.json",
|
| 483 |
"exists": true,
|
| 484 |
-
"bytes":
|
| 485 |
-
"sha256": "
|
| 486 |
},
|
| 487 |
"hf_artifacts": {
|
| 488 |
"path": "hf_artifacts:docs/data/project_status.json",
|
| 489 |
"exists": true,
|
| 490 |
-
"bytes":
|
| 491 |
-
"sha256": "
|
| 492 |
},
|
| 493 |
"hf_model": {
|
| 494 |
"path": "hf_model:metrics/project_status.json",
|
| 495 |
"exists": true,
|
| 496 |
-
"bytes":
|
| 497 |
-
"sha256": "
|
| 498 |
}
|
| 499 |
},
|
| 500 |
"failures": []
|
| 501 |
},
|
| 502 |
{
|
| 503 |
"name": "data/publication_audit.json",
|
| 504 |
-
"status": "
|
| 505 |
"local": {
|
| 506 |
"path": "repo:docs/data/publication_audit.json",
|
| 507 |
"exists": true,
|
| 508 |
"bytes": 7237,
|
| 509 |
-
"sha256": "
|
| 510 |
},
|
| 511 |
"mirrors": {
|
| 512 |
"hf_space": {
|
| 513 |
"path": "hf_space:data/publication_audit.json",
|
| 514 |
"exists": true,
|
| 515 |
"bytes": 7237,
|
| 516 |
-
"sha256": "
|
| 517 |
},
|
| 518 |
"hf_artifacts": {
|
| 519 |
"path": "hf_artifacts:docs/data/publication_audit.json",
|
| 520 |
"exists": true,
|
| 521 |
"bytes": 7237,
|
| 522 |
-
"sha256": "
|
| 523 |
},
|
| 524 |
"hf_model": {
|
| 525 |
"path": "hf_model:metrics/publication_audit.json",
|
| 526 |
"exists": true,
|
| 527 |
"bytes": 7237,
|
| 528 |
-
"sha256": "
|
| 529 |
}
|
| 530 |
},
|
| 531 |
-
"failures": [
|
| 532 |
},
|
| 533 |
{
|
| 534 |
"name": "data/public_surface_qa.json",
|
|
@@ -811,34 +859,56 @@
|
|
| 811 |
},
|
| 812 |
{
|
| 813 |
"name": "data/scope_claims_audit.json",
|
| 814 |
-
"status": "
|
| 815 |
"local": {
|
| 816 |
"path": "repo:docs/data/scope_claims_audit.json",
|
| 817 |
"exists": true,
|
| 818 |
"bytes": 20823,
|
| 819 |
-
"sha256": "
|
| 820 |
},
|
| 821 |
"mirrors": {
|
| 822 |
"hf_space": {
|
| 823 |
"path": "hf_space:data/scope_claims_audit.json",
|
| 824 |
"exists": true,
|
| 825 |
"bytes": 20823,
|
| 826 |
-
"sha256": "
|
| 827 |
},
|
| 828 |
"hf_artifacts": {
|
| 829 |
"path": "hf_artifacts:docs/data/scope_claims_audit.json",
|
| 830 |
"exists": true,
|
| 831 |
"bytes": 20823,
|
| 832 |
-
"sha256": "
|
| 833 |
},
|
| 834 |
"hf_model": {
|
| 835 |
"path": "hf_model:metrics/scope_claims_audit.json",
|
| 836 |
"exists": true,
|
| 837 |
"bytes": 20823,
|
| 838 |
-
"sha256": "
|
| 839 |
}
|
| 840 |
},
|
| 841 |
-
"failures": [
|
| 842 |
},
|
| 843 |
{
|
| 844 |
"name": "data/single_episode_explorer.json",
|
|
@@ -935,34 +1005,56 @@
|
|
| 935 |
},
|
| 936 |
{
|
| 937 |
"name": "data/task_surface_integrity.json",
|
| 938 |
-
"status": "
|
| 939 |
"local": {
|
| 940 |
"path": "repo:docs/data/task_surface_integrity.json",
|
| 941 |
"exists": true,
|
| 942 |
"bytes": 45779,
|
| 943 |
-
"sha256": "
|
| 944 |
},
|
| 945 |
"mirrors": {
|
| 946 |
"hf_space": {
|
| 947 |
"path": "hf_space:data/task_surface_integrity.json",
|
| 948 |
"exists": true,
|
| 949 |
"bytes": 45779,
|
| 950 |
-
"sha256": "
|
| 951 |
},
|
| 952 |
"hf_artifacts": {
|
| 953 |
"path": "hf_artifacts:docs/data/task_surface_integrity.json",
|
| 954 |
"exists": true,
|
| 955 |
"bytes": 45779,
|
| 956 |
-
"sha256": "
|
| 957 |
},
|
| 958 |
"hf_model": {
|
| 959 |
"path": "hf_model:metrics/task_surface_integrity.json",
|
| 960 |
"exists": true,
|
| 961 |
"bytes": 45779,
|
| 962 |
-
"sha256": "
|
| 963 |
}
|
| 964 |
},
|
| 965 |
-
"failures": [
|
| 966 |
},
|
| 967 |
{
|
| 968 |
"name": "data/task_walkthroughs.json",
|
|
@@ -997,34 +1089,56 @@
|
|
| 997 |
},
|
| 998 |
{
|
| 999 |
"name": "data/website_integrity.json",
|
| 1000 |
-
"status": "
|
| 1001 |
"local": {
|
| 1002 |
"path": "repo:docs/data/website_integrity.json",
|
| 1003 |
"exists": true,
|
| 1004 |
"bytes": 15221,
|
| 1005 |
-
"sha256": "
|
| 1006 |
},
|
| 1007 |
"mirrors": {
|
| 1008 |
"hf_space": {
|
| 1009 |
"path": "hf_space:data/website_integrity.json",
|
| 1010 |
"exists": true,
|
| 1011 |
"bytes": 15221,
|
| 1012 |
-
"sha256": "
|
| 1013 |
},
|
| 1014 |
"hf_artifacts": {
|
| 1015 |
"path": "hf_artifacts:docs/data/website_integrity.json",
|
| 1016 |
"exists": true,
|
| 1017 |
"bytes": 15221,
|
| 1018 |
-
"sha256": "
|
| 1019 |
},
|
| 1020 |
"hf_model": {
|
| 1021 |
"path": "hf_model:metrics/website_integrity.json",
|
| 1022 |
"exists": true,
|
| 1023 |
"bytes": 15221,
|
| 1024 |
-
"sha256": "
|
| 1025 |
}
|
| 1026 |
},
|
| 1027 |
-
"failures": [
|
| 1028 |
},
|
| 1029 |
{
|
| 1030 |
"name": "data/xperience10m_dataset_card_alignment.json",
|
|
@@ -1723,6 +1837,46 @@
|
|
| 1723 |
},
|
| 1724 |
"failures": []
|
| 1725 |
},
|
| 1726 |
{
|
| 1727 |
"name": "scripts/audio_ablation_and_raw_upgrade.py",
|
| 1728 |
"status": "pass",
|
|
@@ -1754,21 +1908,21 @@
|
|
| 1754 |
"local": {
|
| 1755 |
"path": "repo:scripts/build_artifact_index.py",
|
| 1756 |
"exists": true,
|
| 1757 |
-
"bytes":
|
| 1758 |
-
"sha256": "
|
| 1759 |
},
|
| 1760 |
"mirrors": {
|
| 1761 |
"hf_artifacts": {
|
| 1762 |
"path": "hf_artifacts:scripts/build_artifact_index.py",
|
| 1763 |
"exists": true,
|
| 1764 |
-
"bytes":
|
| 1765 |
-
"sha256": "
|
| 1766 |
},
|
| 1767 |
"hf_model": {
|
| 1768 |
"path": "hf_model:scripts/build_artifact_index.py",
|
| 1769 |
"exists": true,
|
| 1770 |
-
"bytes":
|
| 1771 |
-
"sha256": "
|
| 1772 |
}
|
| 1773 |
},
|
| 1774 |
"failures": []
|
|
@@ -2054,21 +2208,21 @@
|
|
| 2054 |
"local": {
|
| 2055 |
"path": "repo:scripts/validate_mirror_parity.py",
|
| 2056 |
"exists": true,
|
| 2057 |
-
"bytes":
|
| 2058 |
-
"sha256": "
|
| 2059 |
},
|
| 2060 |
"mirrors": {
|
| 2061 |
"hf_artifacts": {
|
| 2062 |
"path": "hf_artifacts:scripts/validate_mirror_parity.py",
|
| 2063 |
"exists": true,
|
| 2064 |
-
"bytes":
|
| 2065 |
-
"sha256": "
|
| 2066 |
},
|
| 2067 |
"hf_model": {
|
| 2068 |
"path": "hf_model:scripts/validate_mirror_parity.py",
|
| 2069 |
"exists": true,
|
| 2070 |
-
"bytes":
|
| 2071 |
-
"sha256": "
|
| 2072 |
}
|
| 2073 |
},
|
| 2074 |
"failures": []
|
|
@@ -2807,6 +2961,395 @@
|
|
| 2807 |
},
|
| 2808 |
"failures": []
|
| 2809 |
},
|
| 2810 |
{
|
| 2811 |
"name": "docs/QUALITY_GATES.md",
|
| 2812 |
"status": "pass",
|
|
@@ -3061,27 +3604,27 @@
|
|
| 3061 |
"local": {
|
| 3062 |
"path": "repo:PROJECT_STATUS.md",
|
| 3063 |
"exists": true,
|
| 3064 |
-
"bytes":
|
| 3065 |
-
"sha256": "
|
| 3066 |
},
|
| 3067 |
"mirrors": {
|
| 3068 |
"hf_space": {
|
| 3069 |
"path": "hf_space:PROJECT_STATUS.md",
|
| 3070 |
"exists": true,
|
| 3071 |
-
"bytes":
|
| 3072 |
-
"sha256": "
|
| 3073 |
},
|
| 3074 |
"hf_artifacts": {
|
| 3075 |
"path": "hf_artifacts:PROJECT_STATUS.md",
|
| 3076 |
"exists": true,
|
| 3077 |
-
"bytes":
|
| 3078 |
-
"sha256": "
|
| 3079 |
},
|
| 3080 |
"hf_model": {
|
| 3081 |
"path": "hf_model:PROJECT_STATUS.md",
|
| 3082 |
"exists": true,
|
| 3083 |
-
"bytes":
|
| 3084 |
-
"sha256": "
|
| 3085 |
}
|
| 3086 |
},
|
| 3087 |
"failures": []
|
|
@@ -3211,5 +3754,262 @@
|
|
| 3211 |
"failures": []
|
| 3212 |
}
|
| 3213 |
],
|
| 3214 |
-
"failures": [
|
| 3215 |
}
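Each group in this report pairs a `local` file with one or more `mirrors` and records `exists`, `bytes`, and `sha256` for every copy, plus a `failures` list with `surface`, `kind`, `path`, `expected_sha256`, and `actual_sha256`. The following is a minimal sketch of how such a group could be produced. It is not the contents of `scripts/validate_mirror_parity.py`, which is not shown here. The function names `describe` and `compare_group`, the `"missing"` failure kind, and the use of plain filesystem paths (rather than the `repo:` / `hf_space:` prefixed paths in the report) are assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def describe(path: Path) -> dict:
    """Return existence, size in bytes, and SHA-256 hex digest for one file."""
    if not path.is_file():
        return {"path": str(path), "exists": False, "bytes": None, "sha256": None}
    data = path.read_bytes()
    return {
        "path": str(path),
        "exists": True,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def compare_group(name: str, local: Path, mirrors: dict) -> dict:
    """Compare one local file against its mirror copies (hypothetical helper)."""
    local_info = describe(local)
    mirror_infos = {}
    failures = []
    for surface, mirror_path in mirrors.items():
        info = describe(mirror_path)
        mirror_infos[surface] = info
        if not info["exists"]:
            # "missing" is an assumed failure kind; only "hash_mismatch" appears in the report.
            failures.append({"surface": surface, "kind": "missing", "path": info["path"]})
        elif info["sha256"] != local_info["sha256"]:
            failures.append({
                "surface": surface,
                "kind": "hash_mismatch",
                "path": info["path"],
                "expected_sha256": local_info["sha256"],
                "actual_sha256": info["sha256"],
            })
    return {
        "name": name,
        "status": "fail" if failures else "pass",
        "local": local_info,
        "mirrors": mirror_infos,
        "failures": failures,
    }


# Small demonstration with throwaway files: one mirror matches, one is stale.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "local.json").write_text('{"a": 1}\n')
    (root / "same.json").write_text('{"a": 1}\n')
    (root / "stale.json").write_text('{"a": 2}\n')
    report = compare_group(
        "data/example.json",
        root / "local.json",
        {"hf_space": root / "same.json", "hf_model": root / "stale.json"},
    )

print(report["status"], [f["surface"] for f in report["failures"]])
```

In the demonstration the stale mirror has the same byte count as the local file but a different digest. The report shows the same situation for `data/artifact_index.json`, where every copy is 39486 bytes yet the mirror hashes differ from the local one, which is why comparing `bytes` alone would not be enough.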
|
|
|
|
| 1 |
{
|
| 2 |
+
"status": "fail",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:55:21+00:00",
|
| 4 |
"hf_root": "hf_publish",
|
| 5 |
"summary": {
|
| 6 |
+
"group_count": 114,
|
| 7 |
+
"failure_count": 32,
|
| 8 |
+
"failures_by_surface": {
|
| 9 |
+
"hf_space": 10,
|
| 10 |
+
"hf_artifacts": 11,
|
| 11 |
+
"hf_model": 11
|
| 12 |
+
}
|
| 13 |
},
|
| 14 |
"checks": [
|
| 15 |
{
|
| 16 |
"name": "repo_hf_space_artifact_model_data_parity",
|
| 17 |
+
"status": "fail"
|
| 18 |
},
|
| 19 |
{
|
| 20 |
"name": "repo_hf_visual_asset_parity",
|
|
|
|
| 22 |
},
|
| 23 |
{
|
| 24 |
"name": "repo_hf_validator_script_parity",
|
| 25 |
+
"status": "fail"
|
| 26 |
},
|
| 27 |
{
|
| 28 |
"name": "repo_hf_website_html_parity",
|
|
|
|
| 30 |
},
|
| 31 |
{
|
| 32 |
"name": "repo_hf_diagnostic_result_parity",
|
| 33 |
+
"status": "fail"
|
| 34 |
},
|
| 35 |
{
|
| 36 |
"name": "repo_hf_quality_doc_parity",
|
|
|
|
| 102 |
},
|
| 103 |
{
|
| 104 |
"name": "data/artifact_index.json",
|
| 105 |
+
"status": "fail",
|
| 106 |
"local": {
|
| 107 |
"path": "repo:docs/data/artifact_index.json",
|
| 108 |
"exists": true,
|
| 109 |
+
"bytes": 39486,
|
| 110 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 111 |
},
|
| 112 |
"mirrors": {
|
| 113 |
"hf_space": {
|
| 114 |
"path": "hf_space:data/artifact_index.json",
|
| 115 |
"exists": true,
|
| 116 |
+
"bytes": 39486,
|
| 117 |
+
"sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 118 |
},
|
| 119 |
"hf_artifacts": {
|
| 120 |
"path": "hf_artifacts:docs/data/artifact_index.json",
|
| 121 |
"exists": true,
|
| 122 |
+
"bytes": 39486,
|
| 123 |
+
"sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 124 |
},
|
| 125 |
"hf_model": {
|
| 126 |
"path": "hf_model:metrics/artifact_index.json",
|
| 127 |
"exists": true,
|
| 128 |
+
"bytes": 39486,
|
| 129 |
+
"sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 130 |
}
|
| 131 |
},
|
| 132 |
+
"failures": [
|
| 133 |
+
{
|
| 134 |
+
"surface": "hf_space",
|
| 135 |
+
"kind": "hash_mismatch",
|
| 136 |
+
"path": "hf_space:data/artifact_index.json",
|
| 137 |
+
"expected_sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0",
|
| 138 |
+
"actual_sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 139 |
+
},
|
| 140 |
+
{
|
| 141 |
+
"surface": "hf_artifacts",
|
| 142 |
+
"kind": "hash_mismatch",
|
| 143 |
+
"path": "hf_artifacts:docs/data/artifact_index.json",
|
| 144 |
+
"expected_sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0",
|
| 145 |
+
"actual_sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 146 |
+
},
|
| 147 |
+
{
|
| 148 |
+
"surface": "hf_model",
|
| 149 |
+
"kind": "hash_mismatch",
|
| 150 |
+
"path": "hf_model:metrics/artifact_index.json",
|
| 151 |
+
"expected_sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0",
|
| 152 |
+
"actual_sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 153 |
+
}
|
| 154 |
+
]
|
| 155 |
},
|
| 156 |
{
|
| 157 |
"name": "data/brand_assets.json",
|
|
|
|
| 376 |
"local": {
|
| 377 |
"path": "repo:docs/data/omni_finetune_verified_result.json",
|
| 378 |
"exists": true,
|
| 379 |
+
"bytes": 4142,
|
| 380 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 381 |
},
|
| 382 |
"mirrors": {
|
| 383 |
"hf_space": {
|
| 384 |
"path": "hf_space:data/omni_finetune_verified_result.json",
|
| 385 |
"exists": true,
|
| 386 |
+
"bytes": 4142,
|
| 387 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 388 |
},
|
| 389 |
"hf_artifacts": {
|
| 390 |
"path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
|
| 391 |
"exists": true,
|
| 392 |
+
"bytes": 4142,
|
| 393 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 394 |
},
|
| 395 |
"hf_model": {
|
| 396 |
"path": "hf_model:metrics/omni_finetune_verified_result.json",
|
| 397 |
"exists": true,
|
| 398 |
+
"bytes": 4142,
|
| 399 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 400 |
}
|
| 401 |
},
|
| 402 |
"failures": []
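The `summary` object near the top of this report lists `group_count`, `failure_count`, and `failures_by_surface`, and the per-surface counts shown there (10 + 11 + 11) add up to the reported `failure_count` of 32. A tally of that shape could be derived from the per-group `failures` lists roughly as follows. This is a sketch under the assumption that every failure entry carries a `surface` key, as the entries in this report do; the function name `summarize` is hypothetical and the real script may compute its summary differently.

```python
from collections import Counter


def summarize(groups: list) -> dict:
    """Roll per-group failure lists up into a report-level summary (hypothetical helper)."""
    by_surface = Counter(
        failure["surface"]
        for group in groups
        for failure in group.get("failures", [])
    )
    failure_count = sum(by_surface.values())
    return {
        "status": "fail" if failure_count else "pass",
        "summary": {
            "group_count": len(groups),
            "failure_count": failure_count,
            "failures_by_surface": dict(by_surface),
        },
    }


# Two illustrative groups: a script mirrored to two surfaces that both drifted,
# and a file whose copies all match.
example_groups = [
    {
        "name": "scripts/example.py",
        "failures": [
            {"surface": "hf_artifacts", "kind": "hash_mismatch"},
            {"surface": "hf_model", "kind": "hash_mismatch"},
        ],
    },
    {"name": "data/example.json", "failures": []},
]

result = summarize(example_groups)
print(result["status"], result["summary"])
```

Counting per surface rather than per group makes it easy to see whether a single mirror is out of date or whether all mirrors lag behind the local repository, which is the pattern most failing groups in this report show.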
|
|
|
|
| 500 |
"local": {
|
| 501 |
"path": "repo:docs/data/project_status.json",
|
| 502 |
"exists": true,
|
| 503 |
+
"bytes": 11274,
|
| 504 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 505 |
},
|
| 506 |
"mirrors": {
|
| 507 |
"hf_space": {
|
| 508 |
"path": "hf_space:data/project_status.json",
|
| 509 |
"exists": true,
|
| 510 |
+
"bytes": 11274,
|
| 511 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 512 |
},
|
| 513 |
"hf_artifacts": {
|
| 514 |
"path": "hf_artifacts:docs/data/project_status.json",
|
| 515 |
"exists": true,
|
| 516 |
+
"bytes": 11274,
|
| 517 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 518 |
},
|
| 519 |
"hf_model": {
|
| 520 |
"path": "hf_model:metrics/project_status.json",
|
| 521 |
"exists": true,
|
| 522 |
+
"bytes": 11274,
|
| 523 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 524 |
}
|
| 525 |
},
|
| 526 |
"failures": []
|
| 527 |
},
|
| 528 |
{
|
| 529 |
"name": "data/publication_audit.json",
|
| 530 |
+
"status": "fail",
|
| 531 |
"local": {
|
| 532 |
"path": "repo:docs/data/publication_audit.json",
|
| 533 |
"exists": true,
|
| 534 |
"bytes": 7237,
|
| 535 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 536 |
},
|
| 537 |
"mirrors": {
|
| 538 |
"hf_space": {
|
| 539 |
"path": "hf_space:data/publication_audit.json",
|
| 540 |
"exists": true,
|
| 541 |
"bytes": 7237,
|
| 542 |
+
"sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 543 |
},
|
| 544 |
"hf_artifacts": {
|
| 545 |
"path": "hf_artifacts:docs/data/publication_audit.json",
|
| 546 |
"exists": true,
|
| 547 |
"bytes": 7237,
|
| 548 |
+
"sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 549 |
},
|
| 550 |
"hf_model": {
|
| 551 |
"path": "hf_model:metrics/publication_audit.json",
|
| 552 |
"exists": true,
|
| 553 |
"bytes": 7237,
|
| 554 |
+
"sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 555 |
}
|
| 556 |
},
|
| 557 |
+
"failures": [
|
| 558 |
+
{
|
| 559 |
+
"surface": "hf_space",
|
| 560 |
+
"kind": "hash_mismatch",
|
| 561 |
+
"path": "hf_space:data/publication_audit.json",
|
| 562 |
+
"expected_sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d",
|
| 563 |
+
"actual_sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 564 |
+
},
|
| 565 |
+
{
|
| 566 |
+
"surface": "hf_artifacts",
|
| 567 |
+
"kind": "hash_mismatch",
|
| 568 |
+
"path": "hf_artifacts:docs/data/publication_audit.json",
|
| 569 |
+
"expected_sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d",
|
| 570 |
+
"actual_sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 571 |
+
},
|
| 572 |
+
{
|
| 573 |
+
"surface": "hf_model",
|
| 574 |
+
"kind": "hash_mismatch",
|
| 575 |
+
"path": "hf_model:metrics/publication_audit.json",
|
| 576 |
+
"expected_sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d",
|
| 577 |
+
"actual_sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 578 |
+
}
|
| 579 |
+
]
|
| 580 |
},
|
| 581 |
{
|
| 582 |
"name": "data/public_surface_qa.json",
|
|
|
|
| 859 |
},
|
| 860 |
{
|
| 861 |
"name": "data/scope_claims_audit.json",
|
| 862 |
+
"status": "fail",
|
| 863 |
"local": {
|
| 864 |
"path": "repo:docs/data/scope_claims_audit.json",
|
| 865 |
"exists": true,
|
| 866 |
"bytes": 20823,
|
| 867 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 868 |
},
|
| 869 |
"mirrors": {
|
| 870 |
"hf_space": {
|
| 871 |
"path": "hf_space:data/scope_claims_audit.json",
|
| 872 |
"exists": true,
|
| 873 |
"bytes": 20823,
|
| 874 |
+
"sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 875 |
},
|
| 876 |
"hf_artifacts": {
|
| 877 |
"path": "hf_artifacts:docs/data/scope_claims_audit.json",
|
| 878 |
"exists": true,
|
| 879 |
"bytes": 20823,
|
| 880 |
+
"sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 881 |
},
|
| 882 |
"hf_model": {
|
| 883 |
"path": "hf_model:metrics/scope_claims_audit.json",
|
| 884 |
"exists": true,
|
| 885 |
"bytes": 20823,
|
| 886 |
+
"sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 887 |
}
|
| 888 |
},
|
| 889 |
+
"failures": [
|
| 890 |
+
{
|
| 891 |
+
"surface": "hf_space",
|
| 892 |
+
"kind": "hash_mismatch",
|
| 893 |
+
"path": "hf_space:data/scope_claims_audit.json",
|
| 894 |
+
"expected_sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3",
|
| 895 |
+
"actual_sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 896 |
+
},
|
| 897 |
+
{
|
| 898 |
+
"surface": "hf_artifacts",
|
| 899 |
+
"kind": "hash_mismatch",
|
| 900 |
+
"path": "hf_artifacts:docs/data/scope_claims_audit.json",
|
| 901 |
+
"expected_sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3",
|
| 902 |
+
"actual_sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 903 |
+
},
|
| 904 |
+
{
|
| 905 |
+
"surface": "hf_model",
|
| 906 |
+
"kind": "hash_mismatch",
|
| 907 |
+
"path": "hf_model:metrics/scope_claims_audit.json",
|
| 908 |
+
"expected_sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3",
|
| 909 |
+
"actual_sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 910 |
+
}
|
| 911 |
+
]
|
| 912 |
},
|
| 913 |
{
|
| 914 |
"name": "data/single_episode_explorer.json",
|
|
|
|
| 1005 |
},
|
| 1006 |
{
|
| 1007 |
"name": "data/task_surface_integrity.json",
|
| 1008 |
+
"status": "fail",
|
| 1009 |
"local": {
|
| 1010 |
"path": "repo:docs/data/task_surface_integrity.json",
|
| 1011 |
"exists": true,
|
| 1012 |
"bytes": 45779,
|
| 1013 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 1014 |
},
|
| 1015 |
"mirrors": {
|
| 1016 |
"hf_space": {
|
| 1017 |
"path": "hf_space:data/task_surface_integrity.json",
|
| 1018 |
"exists": true,
|
| 1019 |
"bytes": 45779,
|
| 1020 |
+
"sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 1021 |
},
|
| 1022 |
"hf_artifacts": {
|
| 1023 |
"path": "hf_artifacts:docs/data/task_surface_integrity.json",
|
| 1024 |
"exists": true,
|
| 1025 |
"bytes": 45779,
|
| 1026 |
+
"sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 1027 |
},
|
| 1028 |
"hf_model": {
|
| 1029 |
"path": "hf_model:metrics/task_surface_integrity.json",
|
| 1030 |
"exists": true,
|
| 1031 |
"bytes": 45779,
|
| 1032 |
+
"sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 1033 |
}
|
| 1034 |
},
|
| 1035 |
+
"failures": [
|
| 1036 |
+
{
|
| 1037 |
+
"surface": "hf_space",
|
| 1038 |
+
"kind": "hash_mismatch",
|
| 1039 |
+
"path": "hf_space:data/task_surface_integrity.json",
|
| 1040 |
+
"expected_sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6",
|
| 1041 |
+
"actual_sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 1042 |
+
},
|
| 1043 |
+
{
|
| 1044 |
+
"surface": "hf_artifacts",
|
| 1045 |
+
"kind": "hash_mismatch",
|
| 1046 |
+
"path": "hf_artifacts:docs/data/task_surface_integrity.json",
|
| 1047 |
+
"expected_sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6",
|
| 1048 |
+
"actual_sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 1049 |
+
},
|
| 1050 |
+
{
|
| 1051 |
+
"surface": "hf_model",
|
| 1052 |
+
"kind": "hash_mismatch",
|
| 1053 |
+
"path": "hf_model:metrics/task_surface_integrity.json",
|
| 1054 |
+
"expected_sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6",
|
| 1055 |
+
"actual_sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 1056 |
+
}
|
| 1057 |
+
]
|
| 1058 |
},
|
| 1059 |
{
|
| 1060 |
"name": "data/task_walkthroughs.json",
|
|
|
|
| 1089 |
},
|
| 1090 |
{
|
| 1091 |
"name": "data/website_integrity.json",
|
| 1092 |
+
"status": "fail",
|
| 1093 |
"local": {
|
| 1094 |
"path": "repo:docs/data/website_integrity.json",
|
| 1095 |
"exists": true,
|
| 1096 |
"bytes": 15221,
|
| 1097 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1098 |
},
|
| 1099 |
"mirrors": {
|
| 1100 |
"hf_space": {
|
| 1101 |
"path": "hf_space:data/website_integrity.json",
|
| 1102 |
"exists": true,
|
| 1103 |
"bytes": 15221,
|
| 1104 |
+
"sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 1105 |
},
|
| 1106 |
"hf_artifacts": {
|
| 1107 |
"path": "hf_artifacts:docs/data/website_integrity.json",
|
| 1108 |
"exists": true,
|
| 1109 |
"bytes": 15221,
|
| 1110 |
+
"sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 1111 |
},
|
| 1112 |
"hf_model": {
|
| 1113 |
"path": "hf_model:metrics/website_integrity.json",
|
| 1114 |
"exists": true,
|
| 1115 |
"bytes": 15221,
|
| 1116 |
+
"sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 1117 |
}
|
| 1118 |
},
|
| 1119 |
+
"failures": [
|
| 1120 |
+
{
|
| 1121 |
+
"surface": "hf_space",
|
| 1122 |
+
"kind": "hash_mismatch",
|
| 1123 |
+
"path": "hf_space:data/website_integrity.json",
|
| 1124 |
+
"expected_sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2",
|
| 1125 |
+
"actual_sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 1126 |
+
},
|
| 1127 |
+
{
|
| 1128 |
+
"surface": "hf_artifacts",
|
| 1129 |
+
"kind": "hash_mismatch",
|
| 1130 |
+
"path": "hf_artifacts:docs/data/website_integrity.json",
|
| 1131 |
+
"expected_sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2",
|
| 1132 |
+
"actual_sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 1133 |
+
},
|
| 1134 |
+
{
|
| 1135 |
+
"surface": "hf_model",
|
| 1136 |
+
"kind": "hash_mismatch",
|
| 1137 |
+
"path": "hf_model:metrics/website_integrity.json",
|
| 1138 |
+
"expected_sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2",
|
| 1139 |
+
"actual_sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 1140 |
+
}
|
| 1141 |
+
]
|
| 1142 |
},
|
| 1143 |
{
|
| 1144 |
"name": "data/xperience10m_dataset_card_alignment.json",
|
|
|
|
| 1837 |
},
|
| 1838 |
"failures": []
|
| 1839 |
},
|
| 1840 |
+
{
|
| 1841 |
+
"name": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1842 |
+
"status": "fail",
|
| 1843 |
+
"local": {
|
| 1844 |
+
"path": "repo:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1845 |
+
"exists": true,
|
| 1846 |
+
"bytes": 15676,
|
| 1847 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 1848 |
+
},
|
| 1849 |
+
"mirrors": {
|
| 1850 |
+
"hf_artifacts": {
|
| 1851 |
+
"path": "hf_artifacts:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1852 |
+
"exists": true,
|
| 1853 |
+
"bytes": 15655,
|
| 1854 |
+
"sha256": "e90ffd4bb75b001ab41cd956dfbb0a99b574d0b5e8ffc1a64e2887490d658daa"
|
| 1855 |
+
},
|
| 1856 |
+
"hf_model": {
|
| 1857 |
+
"path": "hf_model:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1858 |
+
"exists": true,
|
| 1859 |
+
"bytes": 15655,
|
| 1860 |
+
"sha256": "e90ffd4bb75b001ab41cd956dfbb0a99b574d0b5e8ffc1a64e2887490d658daa"
|
| 1861 |
+
}
|
| 1862 |
+
},
|
| 1863 |
+
"failures": [
|
| 1864 |
+
{
|
| 1865 |
+
"surface": "hf_artifacts",
|
| 1866 |
+
"kind": "hash_mismatch",
|
| 1867 |
+
"path": "hf_artifacts:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1868 |
+
"expected_sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337",
|
| 1869 |
+
"actual_sha256": "e90ffd4bb75b001ab41cd956dfbb0a99b574d0b5e8ffc1a64e2887490d658daa"
|
| 1870 |
+
},
|
| 1871 |
+
{
|
| 1872 |
+
"surface": "hf_model",
|
| 1873 |
+
"kind": "hash_mismatch",
|
| 1874 |
+
"path": "hf_model:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1875 |
+
"expected_sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337",
|
| 1876 |
+
"actual_sha256": "e90ffd4bb75b001ab41cd956dfbb0a99b574d0b5e8ffc1a64e2887490d658daa"
|
| 1877 |
+
}
|
| 1878 |
+
]
|
| 1879 |
+
},
|
| 1880 |
{
|
| 1881 |
"name": "scripts/audio_ablation_and_raw_upgrade.py",
|
| 1882 |
"status": "pass",
|
|
|
|
| 1908 |
"local": {
|
| 1909 |
"path": "repo:scripts/build_artifact_index.py",
|
| 1910 |
"exists": true,
|
| 1911 |
+
"bytes": 32191,
|
| 1912 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1913 |
},
|
| 1914 |
"mirrors": {
|
| 1915 |
"hf_artifacts": {
|
| 1916 |
"path": "hf_artifacts:scripts/build_artifact_index.py",
|
| 1917 |
"exists": true,
|
| 1918 |
+
"bytes": 32191,
|
| 1919 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1920 |
},
|
| 1921 |
"hf_model": {
|
| 1922 |
"path": "hf_model:scripts/build_artifact_index.py",
|
| 1923 |
"exists": true,
|
| 1924 |
+
"bytes": 32191,
|
| 1925 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1926 |
}
|
| 1927 |
},
|
| 1928 |
"failures": []
|
|
|
|
| 2208 |
"local": {
|
| 2209 |
"path": "repo:scripts/validate_mirror_parity.py",
|
| 2210 |
"exists": true,
|
| 2211 |
+
"bytes": 13781,
|
| 2212 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2213 |
},
|
| 2214 |
"mirrors": {
|
| 2215 |
"hf_artifacts": {
|
| 2216 |
"path": "hf_artifacts:scripts/validate_mirror_parity.py",
|
| 2217 |
"exists": true,
|
| 2218 |
+
"bytes": 13781,
|
| 2219 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2220 |
},
|
| 2221 |
"hf_model": {
|
| 2222 |
"path": "hf_model:scripts/validate_mirror_parity.py",
|
| 2223 |
"exists": true,
|
| 2224 |
+
"bytes": 13781,
|
| 2225 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2226 |
}
|
| 2227 |
},
|
| 2228 |
"failures": []
|
|
|
|
| 2961 |
},
|
| 2962 |
"failures": []
|
| 2963 |
},
|
| 2964 |
+
{
|
| 2965 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2966 |
+
"status": "pass",
|
| 2967 |
+
"local": {
|
| 2968 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2969 |
+
"exists": true,
|
| 2970 |
+
"bytes": 3331,
|
| 2971 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2972 |
+
},
|
| 2973 |
+
"mirrors": {
|
| 2974 |
+
"hf_space": {
|
| 2975 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2976 |
+
"exists": true,
|
| 2977 |
+
"bytes": 3331,
|
| 2978 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2979 |
+
},
|
| 2980 |
+
"hf_artifacts": {
|
| 2981 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2982 |
+
"exists": true,
|
| 2983 |
+
"bytes": 3331,
|
| 2984 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2985 |
+
},
|
| 2986 |
+
"hf_model": {
|
| 2987 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2988 |
+
"exists": true,
|
| 2989 |
+
"bytes": 3331,
|
| 2990 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2991 |
+
}
|
| 2992 |
+
},
|
| 2993 |
+
"failures": []
|
| 2994 |
+
},
|
| 2995 |
+
{
|
| 2996 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2997 |
+
"status": "pass",
|
| 2998 |
+
"local": {
|
| 2999 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 3000 |
+
"exists": true,
|
| 3001 |
+
"bytes": 25202,
|
| 3002 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 3003 |
+
},
|
| 3004 |
+
"mirrors": {
|
| 3005 |
+
"hf_space": {
|
| 3006 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 3007 |
+
"exists": true,
|
| 3008 |
+
"bytes": 25202,
|
| 3009 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 3010 |
+
},
|
| 3011 |
+
"hf_artifacts": {
|
| 3012 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 3013 |
+
"exists": true,
|
| 3014 |
+
"bytes": 25202,
|
| 3015 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 3016 |
+
},
|
| 3017 |
+
"hf_model": {
|
| 3018 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 3019 |
+
"exists": true,
|
| 3020 |
+
"bytes": 25202,
|
| 3021 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 3022 |
+
}
|
| 3023 |
+
},
|
| 3024 |
+
"failures": []
|
| 3025 |
+
},
|
| 3026 |
+
{
|
| 3027 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3028 |
+
"status": "fail",
|
| 3029 |
+
"local": {
|
| 3030 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3031 |
+
"exists": true,
|
| 3032 |
+
"bytes": 2121,
|
| 3033 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 3034 |
+
},
|
| 3035 |
+
"mirrors": {
|
| 3036 |
+
"hf_space": {
|
| 3037 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3038 |
+
"exists": true,
|
| 3039 |
+
"bytes": 2136,
|
| 3040 |
+
"sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3041 |
+
},
|
| 3042 |
+
"hf_artifacts": {
|
| 3043 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3044 |
+
"exists": true,
|
| 3045 |
+
"bytes": 2136,
|
| 3046 |
+
"sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3047 |
+
},
|
| 3048 |
+
"hf_model": {
|
| 3049 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3050 |
+
"exists": true,
|
| 3051 |
+
"bytes": 2136,
|
| 3052 |
+
"sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3053 |
+
}
|
| 3054 |
+
},
|
| 3055 |
+
"failures": [
|
| 3056 |
+
{
|
| 3057 |
+
"surface": "hf_space",
|
| 3058 |
+
"kind": "hash_mismatch",
|
| 3059 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3060 |
+
"expected_sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd",
|
| 3061 |
+
"actual_sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3062 |
+
},
|
| 3063 |
+
{
|
| 3064 |
+
"surface": "hf_artifacts",
|
| 3065 |
+
"kind": "hash_mismatch",
|
| 3066 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3067 |
+
"expected_sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd",
|
| 3068 |
+
"actual_sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3069 |
+
},
|
| 3070 |
+
{
|
| 3071 |
+
"surface": "hf_model",
|
| 3072 |
+
"kind": "hash_mismatch",
|
| 3073 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3074 |
+
"expected_sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd",
|
| 3075 |
+
"actual_sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3076 |
+
}
|
| 3077 |
+
]
|
| 3078 |
+
},
|
| 3079 |
+
{
|
| 3080 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3081 |
+
"status": "fail",
|
| 3082 |
+
"local": {
|
| 3083 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3084 |
+
"exists": true,
|
| 3085 |
+
"bytes": 1320,
|
| 3086 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 3087 |
+
},
|
| 3088 |
+
"mirrors": {
|
| 3089 |
+
"hf_space": {
|
| 3090 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3091 |
+
"exists": true,
|
| 3092 |
+
"bytes": 1329,
|
| 3093 |
+
"sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3094 |
+
},
|
| 3095 |
+
"hf_artifacts": {
|
| 3096 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3097 |
+
"exists": true,
|
| 3098 |
+
"bytes": 1329,
|
| 3099 |
+
"sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3100 |
+
},
|
| 3101 |
+
"hf_model": {
|
| 3102 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3103 |
+
"exists": true,
|
| 3104 |
+
"bytes": 1329,
|
| 3105 |
+
"sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3106 |
+
}
|
| 3107 |
+
},
|
| 3108 |
+
"failures": [
|
| 3109 |
+
{
|
| 3110 |
+
"surface": "hf_space",
|
| 3111 |
+
"kind": "hash_mismatch",
|
| 3112 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3113 |
+
"expected_sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430",
|
| 3114 |
+
"actual_sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3115 |
+
},
|
| 3116 |
+
{
|
| 3117 |
+
"surface": "hf_artifacts",
|
| 3118 |
+
"kind": "hash_mismatch",
|
| 3119 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3120 |
+
"expected_sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430",
|
| 3121 |
+
"actual_sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3122 |
+
},
|
| 3123 |
+
{
|
| 3124 |
+
"surface": "hf_model",
|
| 3125 |
+
"kind": "hash_mismatch",
|
| 3126 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3127 |
+
"expected_sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430",
|
| 3128 |
+
"actual_sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3129 |
+
}
|
| 3130 |
+
]
|
| 3131 |
+
},
|
| 3132 |
+
{
|
| 3133 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3134 |
+
"status": "fail",
|
| 3135 |
+
"local": {
|
| 3136 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3137 |
+
"exists": true,
|
| 3138 |
+
"bytes": 572,
|
| 3139 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 3140 |
+
},
|
| 3141 |
+
"mirrors": {
|
| 3142 |
+
"hf_space": {
|
| 3143 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3144 |
+
"exists": true,
|
| 3145 |
+
"bytes": 575,
|
| 3146 |
+
"sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3147 |
+
},
|
| 3148 |
+
"hf_artifacts": {
|
| 3149 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3150 |
+
"exists": true,
|
| 3151 |
+
"bytes": 575,
|
| 3152 |
+
"sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3153 |
+
},
|
| 3154 |
+
"hf_model": {
|
| 3155 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3156 |
+
"exists": true,
|
| 3157 |
+
"bytes": 575,
|
| 3158 |
+
"sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3159 |
+
}
|
| 3160 |
+
},
|
| 3161 |
+
"failures": [
|
| 3162 |
+
{
|
| 3163 |
+
"surface": "hf_space",
|
| 3164 |
+
"kind": "hash_mismatch",
|
| 3165 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3166 |
+
"expected_sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7",
|
| 3167 |
+
"actual_sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3168 |
+
},
|
| 3169 |
+
{
|
| 3170 |
+
"surface": "hf_artifacts",
|
| 3171 |
+
"kind": "hash_mismatch",
|
| 3172 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3173 |
+
"expected_sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7",
|
| 3174 |
+
"actual_sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3175 |
+
},
|
| 3176 |
+
{
|
| 3177 |
+
"surface": "hf_model",
|
| 3178 |
+
"kind": "hash_mismatch",
|
| 3179 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3180 |
+
"expected_sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7",
|
| 3181 |
+
"actual_sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3182 |
+
}
|
| 3183 |
+
]
|
| 3184 |
+
},
|
| 3185 |
+
{
|
| 3186 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3187 |
+
"status": "fail",
|
| 3188 |
+
"local": {
|
| 3189 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3190 |
+
"exists": true,
|
| 3191 |
+
"bytes": 408,
|
| 3192 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 3193 |
+
},
|
| 3194 |
+
"mirrors": {
|
| 3195 |
+
"hf_space": {
|
| 3196 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3197 |
+
"exists": true,
|
| 3198 |
+
"bytes": 410,
|
| 3199 |
+
"sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3200 |
+
},
|
| 3201 |
+
"hf_artifacts": {
|
| 3202 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3203 |
+
"exists": true,
|
| 3204 |
+
"bytes": 410,
|
| 3205 |
+
"sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3206 |
+
},
|
| 3207 |
+
"hf_model": {
|
| 3208 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3209 |
+
"exists": true,
|
| 3210 |
+
"bytes": 410,
|
| 3211 |
+
"sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3212 |
+
}
|
| 3213 |
+
},
|
| 3214 |
+
"failures": [
|
| 3215 |
+
{
|
| 3216 |
+
"surface": "hf_space",
|
| 3217 |
+
"kind": "hash_mismatch",
|
| 3218 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3219 |
+
"expected_sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7",
|
| 3220 |
+
"actual_sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3221 |
+
},
|
| 3222 |
+
{
|
| 3223 |
+
"surface": "hf_artifacts",
|
| 3224 |
+
"kind": "hash_mismatch",
|
| 3225 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3226 |
+
"expected_sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7",
|
| 3227 |
+
"actual_sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3228 |
+
},
|
| 3229 |
+
{
|
| 3230 |
+
"surface": "hf_model",
|
| 3231 |
+
"kind": "hash_mismatch",
|
| 3232 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3233 |
+
"expected_sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7",
|
| 3234 |
+
"actual_sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3235 |
+
}
|
| 3236 |
+
]
|
| 3237 |
+
},
|
| 3238 |
+
{
|
| 3239 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3240 |
+
"status": "fail",
|
| 3241 |
+
"local": {
|
| 3242 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3243 |
+
"exists": true,
|
| 3244 |
+
"bytes": 1704,
|
| 3245 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3246 |
+
},
|
| 3247 |
+
"mirrors": {
|
| 3248 |
+
"hf_space": {
|
| 3249 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3250 |
+
"exists": true,
|
| 3251 |
+
"bytes": 1715,
|
| 3252 |
+
"sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 3253 |
+
},
|
| 3254 |
+
"hf_artifacts": {
|
| 3255 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3256 |
+
"exists": true,
|
| 3257 |
+
"bytes": 1715,
|
| 3258 |
+
"sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 3259 |
+
},
|
| 3260 |
+
"hf_model": {
|
| 3261 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3262 |
+
"exists": true,
|
| 3263 |
+
"bytes": 1715,
|
| 3264 |
+
"sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 3265 |
+
}
|
| 3266 |
+
},
|
| 3267 |
+
"failures": [
|
| 3268 |
+
{
|
| 3269 |
+
"surface": "hf_space",
|
| 3270 |
+
"kind": "hash_mismatch",
|
| 3271 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3272 |
+
"expected_sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8",
|
| 3273 |
+
"actual_sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 3274 |
+
},
|
| 3275 |
+
{
|
| 3276 |
+
"surface": "hf_artifacts",
|
| 3277 |
+
"kind": "hash_mismatch",
|
| 3278 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3279 |
+
"expected_sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8",
|
| 3280 |
+
"actual_sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 3281 |
+
},
|
| 3282 |
+
{
|
| 3283 |
+
"surface": "hf_model",
|
| 3284 |
+
"kind": "hash_mismatch",
|
| 3285 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3286 |
+
"expected_sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8",
|
| 3287 |
+
"actual_sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 3288 |
+
}
|
| 3289 |
+
]
|
| 3290 |
+
},
|
| 3291 |
+
{
|
| 3292 |
+
"name": "docs/ARTIFACT_GUIDE.md",
|
| 3293 |
+
"status": "pass",
|
| 3294 |
+
"local": {
|
| 3295 |
+
"path": "repo:ARTIFACT_GUIDE.md",
|
| 3296 |
+
"exists": true,
|
| 3297 |
+
"bytes": 16318,
|
| 3298 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3299 |
+
},
|
| 3300 |
+
"mirrors": {
|
| 3301 |
+
"hf_space": {
|
| 3302 |
+
"path": "hf_space:ARTIFACT_GUIDE.md",
|
| 3303 |
+
"exists": true,
|
| 3304 |
+
"bytes": 16318,
|
| 3305 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3306 |
+
},
|
| 3307 |
+
"hf_artifacts": {
|
| 3308 |
+
"path": "hf_artifacts:ARTIFACT_GUIDE.md",
|
| 3309 |
+
"exists": true,
|
| 3310 |
+
"bytes": 16318,
|
| 3311 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3312 |
+
},
|
| 3313 |
+
"hf_model": {
|
| 3314 |
+
"path": "hf_model:ARTIFACT_GUIDE.md",
|
| 3315 |
+
"exists": true,
|
| 3316 |
+
"bytes": 16318,
|
| 3317 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3318 |
+
}
|
| 3319 |
+
},
|
| 3320 |
+
"failures": []
|
| 3321 |
+
},
|
| 3322 |
+
{
|
| 3323 |
+
"name": "docs/OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3324 |
+
"status": "pass",
|
| 3325 |
+
"local": {
|
| 3326 |
+
"path": "repo:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3327 |
+
"exists": true,
|
| 3328 |
+
"bytes": 8900,
|
| 3329 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3330 |
+
},
|
| 3331 |
+
"mirrors": {
|
| 3332 |
+
"hf_space": {
|
| 3333 |
+
"path": "hf_space:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3334 |
+
"exists": true,
|
| 3335 |
+
"bytes": 8900,
|
| 3336 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3337 |
+
},
|
| 3338 |
+
"hf_artifacts": {
|
| 3339 |
+
"path": "hf_artifacts:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3340 |
+
"exists": true,
|
| 3341 |
+
"bytes": 8900,
|
| 3342 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3343 |
+
},
|
| 3344 |
+
"hf_model": {
|
| 3345 |
+
"path": "hf_model:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3346 |
+
"exists": true,
|
| 3347 |
+
"bytes": 8900,
|
| 3348 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3349 |
+
}
|
| 3350 |
+
},
|
| 3351 |
+
"failures": []
|
| 3352 |
+
},
|
| 3353 |
{
|
| 3354 |
"name": "docs/QUALITY_GATES.md",
|
| 3355 |
"status": "pass",
|
|
|
|
| 3604 |
"local": {
|
| 3605 |
"path": "repo:PROJECT_STATUS.md",
|
| 3606 |
"exists": true,
|
| 3607 |
+
"bytes": 8805,
|
| 3608 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3609 |
},
|
| 3610 |
"mirrors": {
|
| 3611 |
"hf_space": {
|
| 3612 |
"path": "hf_space:PROJECT_STATUS.md",
|
| 3613 |
"exists": true,
|
| 3614 |
+
"bytes": 8805,
|
| 3615 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3616 |
},
|
| 3617 |
"hf_artifacts": {
|
| 3618 |
"path": "hf_artifacts:PROJECT_STATUS.md",
|
| 3619 |
"exists": true,
|
| 3620 |
+
"bytes": 8805,
|
| 3621 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3622 |
},
|
| 3623 |
"hf_model": {
|
| 3624 |
"path": "hf_model:PROJECT_STATUS.md",
|
| 3625 |
"exists": true,
|
| 3626 |
+
"bytes": 8805,
|
| 3627 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3628 |
}
|
| 3629 |
},
|
| 3630 |
"failures": []
|
|
|
|
| 3754 |
"failures": []
|
| 3755 |
}
|
| 3756 |
],
|
| 3757 |
+
"failures": [
|
| 3758 |
+
{
|
| 3759 |
+
"group": "data/artifact_index.json",
|
| 3760 |
+
"surface": "hf_space",
|
| 3761 |
+
"kind": "hash_mismatch",
|
| 3762 |
+
"path": "hf_space:data/artifact_index.json",
|
| 3763 |
+
"expected_sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0",
|
| 3764 |
+
"actual_sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 3765 |
+
},
|
| 3766 |
+
{
|
| 3767 |
+
"group": "data/artifact_index.json",
|
| 3768 |
+
"surface": "hf_artifacts",
|
| 3769 |
+
"kind": "hash_mismatch",
|
| 3770 |
+
"path": "hf_artifacts:docs/data/artifact_index.json",
|
| 3771 |
+
"expected_sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0",
|
| 3772 |
+
"actual_sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 3773 |
+
},
|
| 3774 |
+
{
|
| 3775 |
+
"group": "data/artifact_index.json",
|
| 3776 |
+
"surface": "hf_model",
|
| 3777 |
+
"kind": "hash_mismatch",
|
| 3778 |
+
"path": "hf_model:metrics/artifact_index.json",
|
| 3779 |
+
"expected_sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0",
|
| 3780 |
+
"actual_sha256": "2563b854f81b07bfde2880647d0145b511be071a1a274fe1e909ce2be7ce43e1"
|
| 3781 |
+
},
|
| 3782 |
+
{
|
| 3783 |
+
"group": "data/publication_audit.json",
|
| 3784 |
+
"surface": "hf_space",
|
| 3785 |
+
"kind": "hash_mismatch",
|
| 3786 |
+
"path": "hf_space:data/publication_audit.json",
|
| 3787 |
+
"expected_sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d",
|
| 3788 |
+
"actual_sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 3789 |
+
},
|
| 3790 |
+
{
|
| 3791 |
+
"group": "data/publication_audit.json",
|
| 3792 |
+
"surface": "hf_artifacts",
|
| 3793 |
+
"kind": "hash_mismatch",
|
| 3794 |
+
"path": "hf_artifacts:docs/data/publication_audit.json",
|
| 3795 |
+
"expected_sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d",
|
| 3796 |
+
"actual_sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 3797 |
+
},
|
| 3798 |
+
{
|
| 3799 |
+
"group": "data/publication_audit.json",
|
| 3800 |
+
"surface": "hf_model",
|
| 3801 |
+
"kind": "hash_mismatch",
|
| 3802 |
+
"path": "hf_model:metrics/publication_audit.json",
|
| 3803 |
+
"expected_sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d",
|
| 3804 |
+
"actual_sha256": "a1741a97c2fb5dee8b9ed8e988b31530128e4fab8b8c458cb8f381e2ad16756c"
|
| 3805 |
+
},
|
| 3806 |
+
{
|
| 3807 |
+
"group": "data/scope_claims_audit.json",
|
| 3808 |
+
"surface": "hf_space",
|
| 3809 |
+
"kind": "hash_mismatch",
|
| 3810 |
+
"path": "hf_space:data/scope_claims_audit.json",
|
| 3811 |
+
"expected_sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3",
|
| 3812 |
+
"actual_sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 3813 |
+
},
|
| 3814 |
+
{
|
| 3815 |
+
"group": "data/scope_claims_audit.json",
|
| 3816 |
+
"surface": "hf_artifacts",
|
| 3817 |
+
"kind": "hash_mismatch",
|
| 3818 |
+
"path": "hf_artifacts:docs/data/scope_claims_audit.json",
|
| 3819 |
+
"expected_sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3",
|
| 3820 |
+
"actual_sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 3821 |
+
},
|
| 3822 |
+
{
|
| 3823 |
+
"group": "data/scope_claims_audit.json",
|
| 3824 |
+
"surface": "hf_model",
|
| 3825 |
+
"kind": "hash_mismatch",
|
| 3826 |
+
"path": "hf_model:metrics/scope_claims_audit.json",
|
| 3827 |
+
"expected_sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3",
|
| 3828 |
+
"actual_sha256": "4fb8c088f8ec533b142534b37e9241f8690c8819333434f5d89336c2af8c1c31"
|
| 3829 |
+
},
|
| 3830 |
+
{
|
| 3831 |
+
"group": "data/task_surface_integrity.json",
|
| 3832 |
+
"surface": "hf_space",
|
| 3833 |
+
"kind": "hash_mismatch",
|
| 3834 |
+
"path": "hf_space:data/task_surface_integrity.json",
|
| 3835 |
+
"expected_sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6",
|
| 3836 |
+
"actual_sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 3837 |
+
},
|
| 3838 |
+
{
|
| 3839 |
+
"group": "data/task_surface_integrity.json",
|
| 3840 |
+
"surface": "hf_artifacts",
|
| 3841 |
+
"kind": "hash_mismatch",
|
| 3842 |
+
"path": "hf_artifacts:docs/data/task_surface_integrity.json",
|
| 3843 |
+
"expected_sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6",
|
| 3844 |
+
"actual_sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 3845 |
+
},
|
| 3846 |
+
{
|
| 3847 |
+
"group": "data/task_surface_integrity.json",
|
| 3848 |
+
"surface": "hf_model",
|
| 3849 |
+
"kind": "hash_mismatch",
|
| 3850 |
+
"path": "hf_model:metrics/task_surface_integrity.json",
|
| 3851 |
+
"expected_sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6",
|
| 3852 |
+
"actual_sha256": "51c30fe86c558042960e57a252bc6d3c67d95d5a70a8747043a1cdffe57cf53f"
|
| 3853 |
+
},
|
| 3854 |
+
{
|
| 3855 |
+
"group": "data/website_integrity.json",
|
| 3856 |
+
"surface": "hf_space",
|
| 3857 |
+
"kind": "hash_mismatch",
|
| 3858 |
+
"path": "hf_space:data/website_integrity.json",
|
| 3859 |
+
"expected_sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2",
|
| 3860 |
+
"actual_sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 3861 |
+
},
|
| 3862 |
+
{
|
| 3863 |
+
"group": "data/website_integrity.json",
|
| 3864 |
+
"surface": "hf_artifacts",
|
| 3865 |
+
"kind": "hash_mismatch",
|
| 3866 |
+
"path": "hf_artifacts:docs/data/website_integrity.json",
|
| 3867 |
+
"expected_sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2",
|
| 3868 |
+
"actual_sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 3869 |
+
},
|
| 3870 |
+
{
|
| 3871 |
+
"group": "data/website_integrity.json",
|
| 3872 |
+
"surface": "hf_model",
|
| 3873 |
+
"kind": "hash_mismatch",
|
| 3874 |
+
"path": "hf_model:metrics/website_integrity.json",
|
| 3875 |
+
"expected_sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2",
|
| 3876 |
+
"actual_sha256": "140d8be179f51351ae55ba7587b7042a1e512e72d9318563a78c96a25e13f830"
|
| 3877 |
+
},
|
| 3878 |
+
{
|
| 3879 |
+
"group": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 3880 |
+
"surface": "hf_artifacts",
|
| 3881 |
+
"kind": "hash_mismatch",
|
| 3882 |
+
"path": "hf_artifacts:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 3883 |
+
"expected_sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337",
|
| 3884 |
+
"actual_sha256": "e90ffd4bb75b001ab41cd956dfbb0a99b574d0b5e8ffc1a64e2887490d658daa"
|
| 3885 |
+
},
|
| 3886 |
+
{
|
| 3887 |
+
"group": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 3888 |
+
"surface": "hf_model",
|
| 3889 |
+
"kind": "hash_mismatch",
|
| 3890 |
+
"path": "hf_model:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 3891 |
+
"expected_sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337",
|
| 3892 |
+
"actual_sha256": "e90ffd4bb75b001ab41cd956dfbb0a99b574d0b5e8ffc1a64e2887490d658daa"
|
| 3893 |
+
},
|
| 3894 |
+
{
|
| 3895 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3896 |
+
"surface": "hf_space",
|
| 3897 |
+
"kind": "hash_mismatch",
|
| 3898 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3899 |
+
"expected_sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd",
|
| 3900 |
+
"actual_sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3901 |
+
},
|
| 3902 |
+
{
|
| 3903 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3904 |
+
"surface": "hf_artifacts",
|
| 3905 |
+
"kind": "hash_mismatch",
|
| 3906 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3907 |
+
"expected_sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd",
|
| 3908 |
+
"actual_sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3909 |
+
},
|
| 3910 |
+
{
|
| 3911 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3912 |
+
"surface": "hf_model",
|
| 3913 |
+
"kind": "hash_mismatch",
|
| 3914 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 3915 |
+
"expected_sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd",
|
| 3916 |
+
"actual_sha256": "4024fa756edb5a8a9aaac7c213eb411e8d146b109594ad339cc13b08c960bba9"
|
| 3917 |
+
},
|
| 3918 |
+
{
|
| 3919 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3920 |
+
"surface": "hf_space",
|
| 3921 |
+
"kind": "hash_mismatch",
|
| 3922 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3923 |
+
"expected_sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430",
|
| 3924 |
+
"actual_sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3925 |
+
},
|
| 3926 |
+
{
|
| 3927 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3928 |
+
"surface": "hf_artifacts",
|
| 3929 |
+
"kind": "hash_mismatch",
|
| 3930 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3931 |
+
"expected_sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430",
|
| 3932 |
+
"actual_sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3933 |
+
},
|
| 3934 |
+
{
|
| 3935 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3936 |
+
"surface": "hf_model",
|
| 3937 |
+
"kind": "hash_mismatch",
|
| 3938 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 3939 |
+
"expected_sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430",
|
| 3940 |
+
"actual_sha256": "d995069202708fa456b35aa459ba6e66d90c799b3c4f7b43aa0f6ac4871c986a"
|
| 3941 |
+
},
|
| 3942 |
+
{
|
| 3943 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3944 |
+
"surface": "hf_space",
|
| 3945 |
+
"kind": "hash_mismatch",
|
| 3946 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3947 |
+
"expected_sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7",
|
| 3948 |
+
"actual_sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3949 |
+
},
|
| 3950 |
+
{
|
| 3951 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3952 |
+
"surface": "hf_artifacts",
|
| 3953 |
+
"kind": "hash_mismatch",
|
| 3954 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3955 |
+
"expected_sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7",
|
| 3956 |
+
"actual_sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3957 |
+
},
|
| 3958 |
+
{
|
| 3959 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3960 |
+
"surface": "hf_model",
|
| 3961 |
+
"kind": "hash_mismatch",
|
| 3962 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 3963 |
+
"expected_sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7",
|
| 3964 |
+
"actual_sha256": "51de0dd0c65d6edc25e78598eebd681fd6ec16ac27de0fa406cc4318023402ad"
|
| 3965 |
+
},
|
| 3966 |
+
{
|
| 3967 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3968 |
+
"surface": "hf_space",
|
| 3969 |
+
"kind": "hash_mismatch",
|
| 3970 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3971 |
+
"expected_sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7",
|
| 3972 |
+
"actual_sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3973 |
+
},
|
| 3974 |
+
{
|
| 3975 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3976 |
+
"surface": "hf_artifacts",
|
| 3977 |
+
"kind": "hash_mismatch",
|
| 3978 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3979 |
+
"expected_sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7",
|
| 3980 |
+
"actual_sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3981 |
+
},
|
| 3982 |
+
{
|
| 3983 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3984 |
+
"surface": "hf_model",
|
| 3985 |
+
"kind": "hash_mismatch",
|
| 3986 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3987 |
+
"expected_sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7",
|
| 3988 |
+
"actual_sha256": "1bef2f2a709e2b93a01d1cbb43bb483de5bc5b18b25707c796e1df0bab204171"
|
| 3989 |
+
},
|
| 3990 |
+
{
|
| 3991 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3992 |
+
"surface": "hf_space",
|
| 3993 |
+
"kind": "hash_mismatch",
|
| 3994 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3995 |
+
"expected_sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8",
|
| 3996 |
+
"actual_sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 3997 |
+
},
|
| 3998 |
+
{
|
| 3999 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 4000 |
+
"surface": "hf_artifacts",
|
| 4001 |
+
"kind": "hash_mismatch",
|
| 4002 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 4003 |
+
"expected_sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8",
|
| 4004 |
+
"actual_sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 4005 |
+
},
|
| 4006 |
+
{
|
| 4007 |
+
"group": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 4008 |
+
"surface": "hf_model",
|
| 4009 |
+
"kind": "hash_mismatch",
|
| 4010 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 4011 |
+
"expected_sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8",
|
| 4012 |
+
"actual_sha256": "5c0e94caf4fe1eb26565e0bb796cd3c1eed2741b98a528f38317bcfb6a4c2e23"
|
| 4013 |
+
}
|
| 4014 |
+
]
|
| 4015 |
}
|
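The `hash_mismatch` records in the mirror parity report above compare a local file's SHA-256 digest against each mirror surface (`hf_space`, `hf_artifacts`, `hf_model`). The following is a minimal sketch of how such a comparison could be produced, not the repository's actual audit script. The helper names `sha256_of` and `parity_entry` are hypothetical, the `"missing"` failure kind is an assumption that does not appear in the report, and the `surface:name` path form is a simplification (the report uses different relative paths on some surfaces).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parity_entry(name: str, local: Path, mirrors: dict[str, Path]) -> dict:
    """Compare one local file against its mirror copies (hypothetical helper)."""
    expected = sha256_of(local)
    failures = []
    for surface, mirror_path in mirrors.items():
        if not mirror_path.exists():
            # "missing" is an assumed failure kind; only "hash_mismatch" is in the report.
            failures.append(
                {"surface": surface, "kind": "missing", "path": f"{surface}:{name}"}
            )
            continue
        actual = sha256_of(mirror_path)
        if actual != expected:
            failures.append(
                {
                    "surface": surface,
                    "kind": "hash_mismatch",
                    "path": f"{surface}:{name}",
                    "expected_sha256": expected,
                    "actual_sha256": actual,
                }
            )
    return {
        "name": name,
        "status": "fail" if failures else "pass",
        "failures": failures,
    }
```

Because the digest covers raw bytes, any byte-level difference between the local file and a mirror copy is reported as a mismatch, even when the files look identical in a viewer.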
data/omni_finetune_verified_result.json
CHANGED
|
@@ -67,7 +67,28 @@
|
|
| 67 |
"audit_status": "pass",
|
| 68 |
"contains_raw_xperience10m_data": false,
|
| 69 |
"contains_qwen_base_weights": false,
|
| 70 |
-
"contains_lora_weights": false
|
| 71 |
},
|
| 72 |
"required_next_steps": [
|
| 73 |
"Improve JSON-format reliability through prompt, decoding, constrained parsing, or target formatting changes.",
|
|
|
|
| 67 |
"audit_status": "pass",
|
| 68 |
"contains_raw_xperience10m_data": false,
|
| 69 |
"contains_qwen_base_weights": false,
|
| 70 |
+
"contains_lora_weights": false,
|
| 71 |
+
"error_analysis": {
|
| 72 |
+
"status": "pass",
|
| 73 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 74 |
+
"markdown_report": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 75 |
+
"groupings": [
|
| 76 |
+
"episode",
|
| 77 |
+
"action_family",
|
| 78 |
+
"train_seen_status",
|
| 79 |
+
"required_modality_state",
|
| 80 |
+
"object_category"
|
| 81 |
+
],
|
| 82 |
+
"key_readouts": {
|
| 83 |
+
"parsed_prediction_rate": 0.8772321428571429,
|
| 84 |
+
"weakest_action_family": "locomotion",
|
| 85 |
+
"weakest_action_family_samples": 23,
|
| 86 |
+
"weakest_action_family_parsed_prediction_rate": 0.2608695652173913,
|
| 87 |
+
"seen_action_exact_rate": 0.04580152671755725,
|
| 88 |
+
"unseen_action_exact_rate": 0.015772870662460567,
|
| 89 |
+
"required_modality_state": "rrd_missing_only_required_modalities_present"
|
| 90 |
+
}
|
| 91 |
+
}
|
| 92 |
},
|
| 93 |
"required_next_steps": [
|
| 94 |
"Improve JSON-format reliability through prompt, decoding, constrained parsing, or target formatting changes.",
|
data/project_status.json
CHANGED
|
@@ -180,10 +180,12 @@
|
|
| 180 |
"evidence": [
|
| 181 |
"docs/data/omni_finetune_verified_result.json",
|
| 182 |
"results/omni_finetune/verified_public/",
|
|
|
|
| 183 |
"scripts/omni/package_verified_omni_result.py",
|
| 184 |
-
"scripts/omni/audit_verified_omni_package.py"
|
|
|
|
| 185 |
],
|
| 186 |
-
"readout": "The selected 96/16/16 episode split produced a validation-aware public-safe held-out package with 3,808 exported windows, 512 validation windows,
|
| 187 |
},
|
| 188 |
{
|
| 189 |
"area": "Raw Xperience-10M redistribution",
|
|
|
|
| 180 |
"evidence": [
|
| 181 |
"docs/data/omni_finetune_verified_result.json",
|
| 182 |
"results/omni_finetune/verified_public/",
|
| 183 |
+
"results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/",
|
| 184 |
"scripts/omni/package_verified_omni_result.py",
|
| 185 |
+
"scripts/omni/audit_verified_omni_package.py",
|
| 186 |
+
"scripts/omni/analyze_qwen3_omni_errors.py"
|
| 187 |
],
|
| 188 |
+
"readout": "The selected 96/16/16 episode split produced a validation-aware public-safe held-out package with 3,808 exported windows, 512 validation windows, 448 test predictions, and derived error-analysis tables by episode, action family, train-seen status, required-modality state, and object category. JSON validity is 87.50%, below the 98% target, so it is a diagnostic baseline but not a strong model-quality result."
|
| 189 |
},
|
| 190 |
{
|
| 191 |
"area": "Raw Xperience-10M redistribution",
|
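The rates in `key_readouts` and the `readout` string can be mapped back to integer counts. The sketch below assumes that the overall parsed-prediction rate and the JSON validity are taken over the 448 test predictions named in the readout, and that the locomotion rate is taken over its 23 samples. The denominators of the seen and unseen exact rates are not stated in the diff, so the sketch only recovers the simplest fraction that reproduces each rate. All function names are hypothetical:

```python
from fractions import Fraction


def count_from_rate(rate, total):
    """Recover the integer numerator k behind rate = k / total."""
    return round(rate * total)


def simplest_fraction(rate, max_denominator):
    """Closest fraction with a bounded denominator (a guess, not a stated count)."""
    return Fraction(rate).limit_denominator(max_denominator)


TEST_PREDICTIONS = 448        # from the project_status readout
JSON_VALIDITY_TARGET = 0.98   # from the project_status readout

parsed = count_from_rate(0.8772321428571429, TEST_PREDICTIONS)
locomotion_parsed = count_from_rate(0.2608695652173913, 23)
json_valid = count_from_rate(0.8750, TEST_PREDICTIONS)

# Denominators are assumed to be at most the number of test predictions.
seen = simplest_fraction(0.04580152671755725, TEST_PREDICTIONS)
unseen = simplest_fraction(0.015772870662460567, TEST_PREDICTIONS)

# Boolean form of the "below the 98% target" statement in the readout.
meets_target = json_valid / TEST_PREDICTIONS >= JSON_VALIDITY_TARGET
```

Under these assumptions the parsed-prediction rate corresponds to 393 of 448 predictions, the locomotion rate to 6 of 23, and the 87.50% JSON validity to 392 of 448. The seen and unseen rates reduce to 6/131 and 5/317, whose denominators sum to 448. That is consistent with a 131/317 seen/unseen split of the test windows but does not confirm it.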
data/publication_audit.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"checks": [
|
| 5 |
{
|
| 6 |
"name": "required_publication_assets_present",
|
|
@@ -182,8 +182,8 @@
|
|
| 182 |
"github_repo": {
|
| 183 |
"root": "repo",
|
| 184 |
"exists": true,
|
| 185 |
-
"file_count":
|
| 186 |
-
"text_file_count":
|
| 187 |
"largest_file": {
|
| 188 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 189 |
"bytes": 55702978
|
|
@@ -193,8 +193,8 @@
|
|
| 193 |
"hf_space_bundle": {
|
| 194 |
"root": "hf_publish/space",
|
| 195 |
"exists": true,
|
| 196 |
-
"file_count":
|
| 197 |
-
"text_file_count":
|
| 198 |
"largest_file": {
|
| 199 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 200 |
"bytes": 55702978
|
|
@@ -204,8 +204,8 @@
|
|
| 204 |
"hf_artifact_bundle": {
|
| 205 |
"root": "hf_publish/artifacts",
|
| 206 |
"exists": true,
|
| 207 |
-
"file_count":
|
| 208 |
-
"text_file_count":
|
| 209 |
"largest_file": {
|
| 210 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 211 |
"bytes": 55702978
|
|
@@ -215,8 +215,8 @@
|
|
| 215 |
"hf_model_bundle": {
|
| 216 |
"root": "hf_publish/model",
|
| 217 |
"exists": true,
|
| 218 |
-
"file_count":
|
| 219 |
-
"text_file_count":
|
| 220 |
"largest_file": {
|
| 221 |
"path": "pytorch_model.bin",
|
| 222 |
"bytes": 93495480
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:02+00:00",
|
| 4 |
"checks": [
|
| 5 |
{
|
| 6 |
"name": "required_publication_assets_present",
|
|
|
|
| 182 |
"github_repo": {
|
| 183 |
"root": "repo",
|
| 184 |
"exists": true,
|
| 185 |
+
"file_count": 450,
|
| 186 |
+
"text_file_count": 380,
|
| 187 |
"largest_file": {
|
| 188 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 189 |
"bytes": 55702978
|
|
|
|
| 193 |
"hf_space_bundle": {
|
| 194 |
"root": "hf_publish/space",
|
| 195 |
"exists": true,
|
| 196 |
+
"file_count": 363,
|
| 197 |
+
"text_file_count": 293,
|
| 198 |
"largest_file": {
|
| 199 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 200 |
"bytes": 55702978
|
|
|
|
| 204 |
"hf_artifact_bundle": {
|
| 205 |
"root": "hf_publish/artifacts",
|
| 206 |
"exists": true,
|
| 207 |
+
"file_count": 522,
|
| 208 |
+
"text_file_count": 428,
|
| 209 |
"largest_file": {
|
| 210 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 211 |
"bytes": 55702978
|
|
|
|
| 215 |
"hf_model_bundle": {
|
| 216 |
"root": "hf_publish/model",
|
| 217 |
"exists": true,
|
| 218 |
+
"file_count": 709,
|
| 219 |
+
"text_file_count": 580,
|
| 220 |
"largest_file": {
|
| 221 |
"path": "pytorch_model.bin",
|
| 222 |
"bytes": 93495480
|
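Each bundle record above reports `root`, `exists`, `file_count`, `text_file_count`, and `largest_file`. A minimal sketch of how such a record could be gathered, assuming a simple recursive directory walk. How the audit actually decides that a file is text is not shown in the diff; the UTF-8 probe below is an assumption, and `looks_like_text` and `bundle_stats` are hypothetical names:

```python
import codecs
from pathlib import Path


def looks_like_text(path, probe_bytes=4096):
    """Heuristic (assumption): no NUL bytes and the head decodes as UTF-8."""
    with path.open("rb") as handle:
        head = handle.read(probe_bytes)
    if b"\x00" in head:
        return False
    try:
        # The incremental decoder tolerates a multibyte character cut at the probe edge.
        codecs.getincrementaldecoder("utf-8")().decode(head)
    except UnicodeDecodeError:
        return False
    return True


def bundle_stats(root):
    """Summarize a directory in the shape of the audit records (hypothetical helper)."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()] if root.is_dir() else []
    largest = max(files, key=lambda p: p.stat().st_size, default=None)
    return {
        "root": str(root),
        "exists": root.exists(),
        "file_count": len(files),
        "text_file_count": sum(looks_like_text(p) for p in files),
        "largest_file": None if largest is None else {
            "path": largest.relative_to(root).as_posix(),
            "bytes": largest.stat().st_size,
        },
    }
```

Calling `bundle_stats("hf_publish/model")` would return one record of this shape for that bundle root; the counts depend entirely on the chosen text heuristic.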
data/scope_claims_audit.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"summary": {
|
| 5 |
"qwen3_omni_verified_diagnostic_pilot": true,
|
| 6 |
"dataset_manifest_num_episodes": 119,
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:01+00:00",
|
| 4 |
"summary": {
|
| 5 |
"qwen3_omni_verified_diagnostic_pilot": true,
|
| 6 |
"dataset_manifest_num_episodes": 119,
|
data/task_surface_integrity.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"summary": {
|
| 5 |
"task_count": 12,
|
| 6 |
"expected_task_count": 12,
|
|
@@ -64,15 +64,21 @@
|
|
| 64 |
"observed": "timeline_action"
|
| 65 |
},
|
| 66 |
{
|
| 67 |
-
"name": "timeline_action:
|
| 68 |
"status": "pass",
|
| 69 |
-
"value": "
|
| 70 |
"raw_hits": []
|
| 71 |
},
|
| 72 |
{
|
| 73 |
-
"name": "timeline_action:
|
| 74 |
"status": "pass",
|
| 75 |
-
"value": "
|
|
| 76 |
"raw_hits": []
|
| 77 |
},
|
| 78 |
{
|
|
@@ -88,9 +94,9 @@
|
|
| 88 |
"raw_hits": []
|
| 89 |
},
|
| 90 |
{
|
| 91 |
-
"name": "timeline_action:
|
| 92 |
"status": "pass",
|
| 93 |
-
"value": "
|
| 94 |
"raw_hits": []
|
| 95 |
},
|
| 96 |
{
|
|
@@ -99,12 +105,6 @@
|
|
| 99 |
"value": "Look at one short multimodal window and name what action is happening now.",
|
| 100 |
"raw_hits": []
|
| 101 |
},
|
| 102 |
-
{
|
| 103 |
-
"name": "timeline_action: public_field_process_short_is_human_readable",
|
| 104 |
-
"status": "pass",
|
| 105 |
-
"value": "window features -> action label builder -> classifier",
|
| 106 |
-
"raw_hits": []
|
| 107 |
-
},
|
| 108 |
{
|
| 109 |
"name": "timeline_action: known_task_family",
|
| 110 |
"status": "pass",
|
|
@@ -184,15 +184,21 @@
|
|
| 184 |
"observed": "timeline_subtask"
|
| 185 |
},
|
| 186 |
{
|
| 187 |
-
"name": "timeline_subtask:
|
| 188 |
"status": "pass",
|
| 189 |
-
"value": "
|
| 190 |
"raw_hits": []
|
| 191 |
},
|
| 192 |
{
|
| 193 |
-
"name": "timeline_subtask:
|
| 194 |
"status": "pass",
|
| 195 |
-
"value": "
|
|
| 196 |
"raw_hits": []
|
| 197 |
},
|
| 198 |
{
|
|
@@ -208,9 +214,9 @@
|
|
| 208 |
"raw_hits": []
|
| 209 |
},
|
| 210 |
{
|
| 211 |
-
"name": "timeline_subtask:
|
| 212 |
"status": "pass",
|
| 213 |
-
"value": "
|
| 214 |
"raw_hits": []
|
| 215 |
},
|
| 216 |
{
|
|
@@ -219,12 +225,6 @@
|
|
| 219 |
"value": "Predict the higher-level task stage for the current window.",
|
| 220 |
"raw_hits": []
|
| 221 |
},
|
| 222 |
-
{
|
| 223 |
-
"name": "timeline_subtask: public_field_process_short_is_human_readable",
|
| 224 |
-
"status": "pass",
|
| 225 |
-
"value": "window features -> subtask label builder -> classifier",
|
| 226 |
-
"raw_hits": []
|
| 227 |
-
},
|
| 228 |
{
|
| 229 |
"name": "timeline_subtask: known_task_family",
|
| 230 |
"status": "pass",
|
|
@@ -304,15 +304,21 @@
|
|
| 304 |
"observed": "transition_detection"
|
| 305 |
},
|
| 306 |
{
|
| 307 |
-
"name": "transition_detection:
|
| 308 |
"status": "pass",
|
| 309 |
-
"value": "
|
| 310 |
"raw_hits": []
|
| 311 |
},
|
| 312 |
{
|
| 313 |
-
"name": "transition_detection:
|
| 314 |
"status": "pass",
|
| 315 |
-
"value": "
|
|
| 316 |
"raw_hits": []
|
| 317 |
},
|
| 318 |
{
|
|
@@ -328,9 +334,9 @@
|
|
| 328 |
"raw_hits": []
|
| 329 |
},
|
| 330 |
{
|
| 331 |
-
"name": "transition_detection:
|
| 332 |
"status": "pass",
|
| 333 |
-
"value": "
|
| 334 |
"raw_hits": []
|
| 335 |
},
|
| 336 |
{
|
|
@@ -339,12 +345,6 @@
|
|
| 339 |
"value": "Detect whether the current window is near a boundary between actions.",
|
| 340 |
"raw_hits": []
|
| 341 |
},
|
| 342 |
-
{
|
| 343 |
-
"name": "transition_detection: public_field_process_short_is_human_readable",
|
| 344 |
-
"status": "pass",
|
| 345 |
-
"value": "action changes -> boundary labels -> binary classifier",
|
| 346 |
-
"raw_hits": []
|
| 347 |
-
},
|
| 348 |
{
|
| 349 |
"name": "transition_detection: known_task_family",
|
| 350 |
"status": "pass",
|
|
@@ -422,15 +422,21 @@
|
|
| 422 |
"observed": "next_action"
|
| 423 |
},
|
| 424 |
{
|
| 425 |
-
"name": "next_action:
|
| 426 |
"status": "pass",
|
| 427 |
-
"value": "
|
| 428 |
"raw_hits": []
|
| 429 |
},
|
| 430 |
{
|
| 431 |
-
"name": "next_action:
|
| 432 |
"status": "pass",
|
| 433 |
-
"value": "
|
|
| 434 |
"raw_hits": []
|
| 435 |
},
|
| 436 |
{
|
|
@@ -446,9 +452,9 @@
|
|
| 446 |
"raw_hits": []
|
| 447 |
},
|
| 448 |
{
|
| 449 |
-
"name": "next_action:
|
| 450 |
"status": "pass",
|
| 451 |
-
"value": "
|
| 452 |
"raw_hits": []
|
| 453 |
},
|
| 454 |
{
|
|
@@ -457,12 +463,6 @@
|
|
| 457 |
"value": "Use the current window to guess the action that will happen shortly after it.",
|
| 458 |
"raw_hits": []
|
| 459 |
},
|
| 460 |
-
{
|
| 461 |
-
"name": "next_action: public_field_process_short_is_human_readable",
|
| 462 |
-
"status": "pass",
|
| 463 |
-
"value": "current features -> future label shift -> classifier",
|
| 464 |
-
"raw_hits": []
|
| 465 |
-
},
|
| 466 |
{
|
| 467 |
"name": "next_action: known_task_family",
|
| 468 |
"status": "pass",
|
|
@@ -540,15 +540,21 @@
|
|
| 540 |
"observed": "hand_trajectory_forecast"
|
| 541 |
},
|
| 542 |
{
|
| 543 |
-
"name": "hand_trajectory_forecast:
|
| 544 |
"status": "pass",
|
| 545 |
-
"value": "current multimodal
|
| 546 |
"raw_hits": []
|
| 547 |
},
|
| 548 |
{
|
| 549 |
-
"name": "hand_trajectory_forecast:
|
| 550 |
"status": "pass",
|
| 551 |
-
"value": "
|
|
| 552 |
"raw_hits": []
|
| 553 |
},
|
| 554 |
{
|
|
@@ -564,9 +570,9 @@
|
|
| 564 |
"raw_hits": []
|
| 565 |
},
|
| 566 |
{
|
| 567 |
-
"name": "hand_trajectory_forecast:
|
| 568 |
"status": "pass",
|
| 569 |
-
"value": "
|
| 570 |
"raw_hits": []
|
| 571 |
},
|
| 572 |
{
|
|
@@ -575,12 +581,6 @@
|
|
| 575 |
"value": "Predict where the hands will move over the next few frames.",
|
| 576 |
"raw_hits": []
|
| 577 |
},
|
| 578 |
-
{
|
| 579 |
-
"name": "hand_trajectory_forecast: public_field_process_short_is_human_readable",
|
| 580 |
-
"status": "pass",
|
| 581 |
-
"value": "current features -> future mocap target -> regression head",
|
| 582 |
-
"raw_hits": []
|
| 583 |
-
},
|
| 584 |
{
|
| 585 |
"name": "hand_trajectory_forecast: known_task_family",
|
| 586 |
"status": "pass",
|
|
@@ -658,15 +658,21 @@
|
|
| 658 |
"observed": "contact_prediction"
|
| 659 |
},
|
| 660 |
{
|
| 661 |
-
"name": "contact_prediction:
|
| 662 |
"status": "pass",
|
| 663 |
-
"value": "
|
| 664 |
"raw_hits": []
|
| 665 |
},
|
| 666 |
{
|
| 667 |
-
"name": "contact_prediction:
|
| 668 |
"status": "pass",
|
| 669 |
-
"value": "
|
|
| 670 |
"raw_hits": []
|
| 671 |
},
|
| 672 |
{
|
|
@@ -682,9 +688,9 @@
|
|
| 682 |
"raw_hits": []
|
| 683 |
},
|
| 684 |
{
|
| 685 |
-
"name": "contact_prediction:
|
| 686 |
"status": "pass",
|
| 687 |
-
"value": "
|
| 688 |
"raw_hits": []
|
| 689 |
},
|
| 690 |
{
|
|
@@ -693,12 +699,6 @@
|
|
| 693 |
"value": "Predict whether the body or hand is in contact with something.",
|
| 694 |
"raw_hits": []
|
| 695 |
},
|
| 696 |
-
{
|
| 697 |
-
"name": "contact_prediction: public_field_process_short_is_human_readable",
|
| 698 |
-
"status": "pass",
|
| 699 |
-
"value": "feature filter -> contact target -> binary classifier",
|
| 700 |
-
"raw_hits": []
|
| 701 |
-
},
|
| 702 |
{
|
| 703 |
"name": "contact_prediction: known_task_family",
|
| 704 |
"status": "pass",
|
|
@@ -774,15 +774,21 @@
|
|
| 774 |
"observed": "object_relevance"
|
| 775 |
},
|
| 776 |
{
|
| 777 |
-
"name": "object_relevance:
|
| 778 |
"status": "pass",
|
| 779 |
-
"value": "non-caption
|
| 780 |
"raw_hits": []
|
| 781 |
},
|
| 782 |
{
|
| 783 |
-
"name": "object_relevance:
|
| 784 |
"status": "pass",
|
| 785 |
-
"value": "
|
|
| 786 |
"raw_hits": []
|
| 787 |
},
|
| 788 |
{
|
|
@@ -798,9 +804,9 @@
|
|
| 798 |
"raw_hits": []
|
| 799 |
},
|
| 800 |
{
|
| 801 |
-
"name": "object_relevance:
|
| 802 |
"status": "pass",
|
| 803 |
-
"value": "
|
| 804 |
"raw_hits": []
|
| 805 |
},
|
| 806 |
{
|
|
@@ -809,12 +815,6 @@
|
|
| 809 |
"value": "Predict which objects matter in the current window.",
|
| 810 |
"raw_hits": []
|
| 811 |
},
|
| 812 |
-
{
|
| 813 |
-
"name": "object_relevance: public_field_process_short_is_human_readable",
|
| 814 |
-
"status": "pass",
|
| 815 |
-
"value": "object vocabulary -> multi-hot labels -> sigmoid heads",
|
| 816 |
-
"raw_hits": []
|
| 817 |
-
},
|
| 818 |
{
|
| 819 |
"name": "object_relevance: known_task_family",
|
| 820 |
"status": "pass",
|
|
@@ -892,15 +892,21 @@
|
|
| 892 |
"observed": "caption_grounding"
|
| 893 |
},
|
| 894 |
{
|
| 895 |
-
"name": "caption_grounding:
|
| 896 |
"status": "pass",
|
| 897 |
-
"value": "
|
| 898 |
"raw_hits": []
|
| 899 |
},
|
| 900 |
{
|
| 901 |
-
"name": "caption_grounding:
|
| 902 |
"status": "pass",
|
| 903 |
-
"value": "
|
|
| 904 |
"raw_hits": []
|
| 905 |
},
|
| 906 |
{
|
|
@@ -916,9 +922,9 @@
|
|
| 916 |
"raw_hits": []
|
| 917 |
},
|
| 918 |
{
|
| 919 |
-
"name": "caption_grounding:
|
| 920 |
"status": "pass",
|
| 921 |
-
"value": "
|
| 922 |
"raw_hits": []
|
| 923 |
},
|
| 924 |
{
|
|
@@ -927,12 +933,6 @@
|
|
| 927 |
"value": "Given a text-like query from annotation, find the matching time window.",
|
| 928 |
"raw_hits": []
|
| 929 |
},
|
| 930 |
-
{
|
| 931 |
-
"name": "caption_grounding: public_field_process_short_is_human_readable",
|
| 932 |
-
"status": "pass",
|
| 933 |
-
"value": "query features -> candidate index -> cosine ranker",
|
| 934 |
-
"raw_hits": []
|
| 935 |
-
},
|
| 936 |
{
|
| 937 |
"name": "caption_grounding: known_task_family",
|
| 938 |
"status": "pass",
|
|
@@ -1008,15 +1008,21 @@
|
|
| 1008 |
"observed": "cross_modal_retrieval"
|
| 1009 |
},
|
| 1010 |
{
|
| 1011 |
-
"name": "cross_modal_retrieval:
|
| 1012 |
"status": "pass",
|
| 1013 |
-
"value": "motion
|
| 1014 |
"raw_hits": []
|
| 1015 |
},
|
| 1016 |
{
|
| 1017 |
-
"name": "cross_modal_retrieval:
|
| 1018 |
"status": "pass",
|
| 1019 |
-
"value": "
|
|
| 1020 |
"raw_hits": []
|
| 1021 |
},
|
| 1022 |
{
|
|
@@ -1032,9 +1038,9 @@
|
|
| 1032 |
"raw_hits": []
|
| 1033 |
},
|
| 1034 |
{
|
| 1035 |
-
"name": "cross_modal_retrieval:
|
| 1036 |
"status": "pass",
|
| 1037 |
-
"value": "
|
| 1038 |
"raw_hits": []
|
| 1039 |
},
|
| 1040 |
{
|
|
@@ -1043,12 +1049,6 @@
|
|
| 1043 |
"value": "Use one group of modalities to retrieve the matching window from another group.",
|
| 1044 |
"raw_hits": []
|
| 1045 |
},
|
| 1046 |
-
{
|
| 1047 |
-
"name": "cross_modal_retrieval: public_field_process_short_is_human_readable",
|
| 1048 |
-
"status": "pass",
|
| 1049 |
-
"value": "modality split -> projection -> nearest-neighbor ranker",
|
| 1050 |
-
"raw_hits": []
|
| 1051 |
-
},
|
| 1052 |
{
|
| 1053 |
"name": "cross_modal_retrieval: known_task_family",
|
| 1054 |
"status": "pass",
|
|
@@ -1126,15 +1126,21 @@
|
|
| 1126 |
"observed": "modality_reconstruction"
|
| 1127 |
},
|
| 1128 |
{
|
| 1129 |
-
"name": "modality_reconstruction:
|
| 1130 |
"status": "pass",
|
| 1131 |
-
"value": "motion, IMU, and camera
|
| 1132 |
"raw_hits": []
|
| 1133 |
},
|
| 1134 |
{
|
| 1135 |
-
"name": "modality_reconstruction:
|
| 1136 |
"status": "pass",
|
| 1137 |
-
"value": "
|
|
| 1138 |
"raw_hits": []
|
| 1139 |
},
|
| 1140 |
{
|
|
@@ -1150,9 +1156,9 @@
|
|
| 1150 |
"raw_hits": []
|
| 1151 |
},
|
| 1152 |
{
|
| 1153 |
-
"name": "modality_reconstruction:
|
| 1154 |
"status": "pass",
|
| 1155 |
-
"value": "
|
| 1156 |
"raw_hits": []
|
| 1157 |
},
|
| 1158 |
{
|
|
@@ -1161,12 +1167,6 @@
|
|
| 1161 |
"value": "Predict one modality feature block from other modality blocks.",
|
| 1162 |
"raw_hits": []
|
| 1163 |
},
|
| 1164 |
-
{
|
| 1165 |
-
"name": "modality_reconstruction: public_field_process_short_is_human_readable",
|
| 1166 |
-
"status": "pass",
|
| 1167 |
-
"value": "source-target split -> scaler -> regression head",
|
| 1168 |
-
"raw_hits": []
|
| 1169 |
-
},
|
| 1170 |
{
|
| 1171 |
"name": "modality_reconstruction: known_task_family",
|
| 1172 |
"status": "pass",
|
|
@@ -1243,12 +1243,6 @@
|
|
| 1243 |
"status": "pass",
|
| 1244 |
"observed": "temporal_order"
|
| 1245 |
},
|
| 1246 |
-
{
|
| 1247 |
-
"name": "temporal_order: public_field_input_short_is_human_readable",
|
| 1248 |
-
"status": "pass",
|
| 1249 |
-
"value": "two adjacent windows plus difference vector",
|
| 1250 |
-
"raw_hits": []
|
| 1251 |
-
},
|
| 1252 |
{
|
| 1253 |
"name": "temporal_order: public_field_card_blurb_is_human_readable",
|
| 1254 |
"status": "pass",
|
|
@@ -1256,27 +1250,27 @@
|
|
| 1256 |
"raw_hits": []
|
| 1257 |
},
|
| 1258 |
{
|
| 1259 |
-
"name": "temporal_order:
|
| 1260 |
"status": "pass",
|
| 1261 |
"value": "Temporal Order Verification",
|
| 1262 |
"raw_hits": []
|
| 1263 |
},
|
| 1264 |
{
|
| 1265 |
-
"name": "temporal_order:
|
| 1266 |
"status": "pass",
|
| 1267 |
-
"value": "
|
| 1268 |
"raw_hits": []
|
| 1269 |
},
|
| 1270 |
{
|
| 1271 |
-
"name": "temporal_order:
|
| 1272 |
"status": "pass",
|
| 1273 |
"value": "Temporal Order Verification",
|
| 1274 |
"raw_hits": []
|
| 1275 |
},
|
| 1276 |
{
|
| 1277 |
-
"name": "temporal_order:
|
| 1278 |
"status": "pass",
|
| 1279 |
-
"value": "
|
| 1280 |
"raw_hits": []
|
| 1281 |
},
|
| 1282 |
{
|
|
@@ -1285,6 +1279,12 @@
|
|
| 1285 |
"value": "pair builder -> feature combiner -> binary classifier",
|
| 1286 |
"raw_hits": []
|
| 1287 |
},
|
| 1288 |
{
|
| 1289 |
"name": "temporal_order: known_task_family",
|
| 1290 |
"status": "pass",
|
|
@@ -1360,15 +1360,21 @@
|
|
| 1360 |
"observed": "misalignment_detection"
|
| 1361 |
},
|
| 1362 |
{
|
| 1363 |
-
"name": "misalignment_detection:
|
| 1364 |
"status": "pass",
|
| 1365 |
-
"value": "motion
|
| 1366 |
"raw_hits": []
|
| 1367 |
},
|
| 1368 |
{
|
| 1369 |
-
"name": "misalignment_detection:
|
| 1370 |
"status": "pass",
|
| 1371 |
-
"value": "
|
|
| 1372 |
"raw_hits": []
|
| 1373 |
},
|
| 1374 |
{
|
|
@@ -1384,9 +1390,9 @@
|
|
| 1384 |
"raw_hits": []
|
| 1385 |
},
|
| 1386 |
{
|
| 1387 |
-
"name": "misalignment_detection:
|
| 1388 |
"status": "pass",
|
| 1389 |
-
"value": "
|
| 1390 |
"raw_hits": []
|
| 1391 |
},
|
| 1392 |
{
|
|
@@ -1395,12 +1401,6 @@
|
|
| 1395 |
"value": "Detect when modalities that should match are shifted out of sync.",
|
| 1396 |
"raw_hits": []
|
| 1397 |
},
|
| 1398 |
-
{
|
| 1399 |
-
"name": "misalignment_detection: public_field_process_short_is_human_readable",
|
| 1400 |
-
"status": "pass",
|
| 1401 |
-
"value": "aligned/shifted pairs -> feature combiner -> binary classifier",
|
| 1402 |
-
"raw_hits": []
|
| 1403 |
-
},
|
| 1404 |
{
|
| 1405 |
"name": "misalignment_detection: known_task_family",
|
| 1406 |
"status": "pass",
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:53:59+00:00",
|
| 4 |
"summary": {
|
| 5 |
"task_count": 12,
|
| 6 |
"expected_task_count": 12,
|
|
|
|
| 64 |
"observed": "timeline_action"
|
| 65 |
},
|
| 66 |
{
|
| 67 |
+
"name": "timeline_action: public_field_card_blurb_is_human_readable",
|
| 68 |
"status": "pass",
|
| 69 |
+
"value": "Recognize the current manipulation action from synchronized visual, motion, inertial, pose, and annotation context.",
|
| 70 |
"raw_hits": []
|
| 71 |
},
|
| 72 |
{
|
| 73 |
+
"name": "timeline_action: public_field_research_name_is_human_readable",
|
| 74 |
"status": "pass",
|
| 75 |
+
"value": "Egocentric Action Recognition",
|
| 76 |
+
"raw_hits": []
|
| 77 |
+
},
|
| 78 |
+
{
|
| 79 |
+
"name": "timeline_action: public_field_input_short_is_human_readable",
|
| 80 |
+
"status": "pass",
|
| 81 |
+
"value": "20-frame multimodal window",
|
| 82 |
"raw_hits": []
|
| 83 |
},
|
| 84 |
{
|
|
|
|
| 94 |
"raw_hits": []
|
| 95 |
},
|
| 96 |
{
|
| 97 |
+
"name": "timeline_action: public_field_process_short_is_human_readable",
|
| 98 |
"status": "pass",
|
| 99 |
+
"value": "window features -> action label builder -> classifier",
|
| 100 |
"raw_hits": []
|
| 101 |
},
|
| 102 |
{
|
|
|
|
| 105 |
"value": "Look at one short multimodal window and name what action is happening now.",
|
| 106 |
"raw_hits": []
|
| 107 |
},
|
| 108 |
{
|
| 109 |
"name": "timeline_action: known_task_family",
|
| 110 |
"status": "pass",
|
|
|
|
| 184 |
"observed": "timeline_subtask"
|
| 185 |
},
|
| 186 |
{
|
| 187 |
+
"name": "timeline_subtask: public_field_card_blurb_is_human_readable",
|
| 188 |
"status": "pass",
|
| 189 |
+
"value": "Recognize the broader activity stage so fine actions become a readable procedure timeline.",
|
| 190 |
"raw_hits": []
|
| 191 |
},
|
| 192 |
{
|
| 193 |
+
"name": "timeline_subtask: public_field_research_name_is_human_readable",
|
| 194 |
"status": "pass",
|
| 195 |
+
"value": "Temporal Subtask Recognition",
|
| 196 |
+
"raw_hits": []
|
| 197 |
+
},
|
| 198 |
+
{
|
| 199 |
+
"name": "timeline_subtask: public_field_input_short_is_human_readable",
|
| 200 |
+
"status": "pass",
|
| 201 |
+
"value": "20-frame multimodal window",
|
| 202 |
"raw_hits": []
|
| 203 |
},
|
| 204 |
{
|
|
|
|
| 214 |
"raw_hits": []
|
| 215 |
},
|
| 216 |
{
|
| 217 |
+
"name": "timeline_subtask: public_field_process_short_is_human_readable",
|
| 218 |
"status": "pass",
|
| 219 |
+
"value": "window features -> subtask label builder -> classifier",
|
| 220 |
"raw_hits": []
|
| 221 |
},
|
| 222 |
{
|
|
|
|
| 225 |
"value": "Predict the higher-level task stage for the current window.",
|
| 226 |
"raw_hits": []
|
| 227 |
},
|
| 228 |
{
|
| 229 |
"name": "timeline_subtask: known_task_family",
|
| 230 |
"status": "pass",
|
|
|
|
| 304 |
"observed": "transition_detection"
|
| 305 |
},
|
| 306 |
{
|
| 307 |
+
"name": "transition_detection: public_field_card_blurb_is_human_readable",
|
| 308 |
"status": "pass",
|
| 309 |
+
"value": "Detect the local moment where the episode changes from one action segment to the next.",
|
| 310 |
"raw_hits": []
|
| 311 |
},
|
| 312 |
{
|
| 313 |
+
"name": "transition_detection: public_field_research_name_is_human_readable",
|
| 314 |
"status": "pass",
|
| 315 |
+
"value": "Temporal Action Segmentation",
|
| 316 |
+
"raw_hits": []
|
| 317 |
+
},
|
| 318 |
+
{
|
| 319 |
+
"name": "transition_detection: public_field_input_short_is_human_readable",
|
| 320 |
+
"status": "pass",
|
| 321 |
+
"value": "current window with boundary target",
|
| 322 |
"raw_hits": []
|
| 323 |
},
|
| 324 |
{
|
|
|
|
| 334 |
"raw_hits": []
|
| 335 |
},
|
| 336 |
{
|
| 337 |
+
"name": "transition_detection: public_field_process_short_is_human_readable",
|
| 338 |
"status": "pass",
|
| 339 |
+
"value": "action changes -> boundary labels -> binary classifier",
|
| 340 |
"raw_hits": []
|
| 341 |
},
|
| 342 |
{
|
|
|
|
| 345 |
"value": "Detect whether the current window is near a boundary between actions.",
|
| 346 |
"raw_hits": []
|
| 347 |
},
|
| 348 |
{
|
| 349 |
"name": "transition_detection: known_task_family",
|
| 350 |
"status": "pass",
|
|
|
|
| 422 |
"observed": "next_action"
|
| 423 |
},
|
| 424 |
{
|
| 425 |
+
"name": "next_action: public_field_card_blurb_is_human_readable",
|
| 426 |
"status": "pass",
|
| 427 |
+
"value": "Forecast the near-future action from the current observations only.",
|
| 428 |
"raw_hits": []
|
| 429 |
},
|
| 430 |
{
|
| 431 |
+
"name": "next_action: public_field_research_name_is_human_readable",
|
| 432 |
"status": "pass",
|
| 433 |
+
"value": "Short-Horizon Intention Prediction",
|
| 434 |
+
"raw_hits": []
|
| 435 |
+
},
|
| 436 |
+
{
|
| 437 |
+
"name": "next_action: public_field_input_short_is_human_readable",
|
| 438 |
+
"status": "pass",
|
| 439 |
+
"value": "current window at time t",
|
| 440 |
"raw_hits": []
|
| 441 |
},
|
| 442 |
{
|
|
|
|
| 452 |
"raw_hits": []
|
| 453 |
},
|
| 454 |
{
|
| 455 |
+
"name": "next_action: public_field_process_short_is_human_readable",
|
| 456 |
"status": "pass",
|
| 457 |
+
"value": "current features -> future label shift -> classifier",
|
| 458 |
"raw_hits": []
|
| 459 |
},
|
| 460 |
{
|
|
|
|
| 463 |
"value": "Use the current window to guess the action that will happen shortly after it.",
|
| 464 |
"raw_hits": []
|
| 465 |
},
|
| 466 |
{
|
| 467 |
"name": "next_action: known_task_family",
|
| 468 |
"status": "pass",
|
|
|
|
| 540 |
"observed": "hand_trajectory_forecast"
|
| 541 |
},
|
| 542 |
{
|
| 543 |
+
"name": "hand_trajectory_forecast: public_field_card_blurb_is_human_readable",
|
| 544 |
"status": "pass",
|
| 545 |
+
"value": "Predict the future 3D left/right hand path from the current multimodal state.",
|
| 546 |
"raw_hits": []
|
| 547 |
},
|
| 548 |
{
|
| 549 |
+
"name": "hand_trajectory_forecast: public_field_research_name_is_human_readable",
|
| 550 |
"status": "pass",
|
| 551 |
+
"value": "3D Hand Motion Forecasting",
|
| 552 |
+
"raw_hits": []
|
| 553 |
+
},
|
| 554 |
+
{
|
| 555 |
+
"name": "hand_trajectory_forecast: public_field_input_short_is_human_readable",
|
| 556 |
+
"status": "pass",
|
| 557 |
+
"value": "current multimodal window",
|
| 558 |
"raw_hits": []
|
| 559 |
},
|
| 560 |
{
|
|
|
|
| 570 |
"raw_hits": []
|
| 571 |
},
|
| 572 |
{
|
| 573 |
+
"name": "hand_trajectory_forecast: public_field_process_short_is_human_readable",
|
| 574 |
"status": "pass",
|
| 575 |
+
"value": "current features -> future mocap target -> regression head",
|
| 576 |
"raw_hits": []
|
| 577 |
},
|
| 578 |
{
|
|
|
|
| 581 |
"value": "Predict where the hands will move over the next few frames.",
|
| 582 |
"raw_hits": []
|
| 583 |
},
|
| 584 |
{
|
| 585 |
"name": "hand_trajectory_forecast: known_task_family",
|
| 586 |
"status": "pass",
|
|
|
|
| 658 |
"observed": "contact_prediction"
|
| 659 |
},
|
| 660 |
{
|
| 661 |
+
"name": "contact_prediction: public_field_card_blurb_is_human_readable",
|
| 662 |
"status": "pass",
|
| 663 |
+
"value": "Predict whether body or hand contact with the scene is occurring without leaking contact labels.",
|
| 664 |
"raw_hits": []
|
| 665 |
},
|
| 666 |
{
|
| 667 |
+
"name": "contact_prediction: public_field_research_name_is_human_readable",
|
| 668 |
"status": "pass",
|
| 669 |
+
"value": "Human-Object Contact Prediction",
|
| 670 |
+
"raw_hits": []
|
| 671 |
+
},
|
| 672 |
+
{
|
| 673 |
+
"name": "contact_prediction: public_field_input_short_is_human_readable",
|
| 674 |
+
"status": "pass",
|
| 675 |
+
"value": "non-contact, non-caption features",
|
| 676 |
"raw_hits": []
|
| 677 |
},
|
| 678 |
{
|
|
|
|
| 688 |
"raw_hits": []
|
| 689 |
},
|
| 690 |
{
|
| 691 |
+
"name": "contact_prediction: public_field_process_short_is_human_readable",
|
| 692 |
"status": "pass",
|
| 693 |
+
"value": "feature filter -> contact target -> binary classifier",
|
| 694 |
"raw_hits": []
|
| 695 |
},
|
| 696 |
{
|
|
|
|
| 699 |
"value": "Predict whether the body or hand is in contact with something.",
|
| 700 |
"raw_hits": []
|
| 701 |
},
|
| 702 |
{
|
| 703 |
"name": "contact_prediction: known_task_family",
|
| 704 |
"status": "pass",
|
|
|
|
| 774 |
"observed": "object_relevance"
|
| 775 |
},
|
| 776 |
{
|
| 777 |
+
"name": "object_relevance: public_field_card_blurb_is_human_readable",
|
| 778 |
"status": "pass",
|
| 779 |
+
"value": "Infer which objects are relevant to the current manipulation window from non-caption features.",
|
| 780 |
"raw_hits": []
|
| 781 |
},
|
| 782 |
{
|
| 783 |
+
"name": "object_relevance: public_field_research_name_is_human_readable",
|
| 784 |
"status": "pass",
|
| 785 |
+
"value": "Object-Centric Interaction Recognition",
|
| 786 |
+
"raw_hits": []
|
| 787 |
+
},
|
| 788 |
+
{
|
| 789 |
+
"name": "object_relevance: public_field_input_short_is_human_readable",
|
| 790 |
+
"status": "pass",
|
| 791 |
+
"value": "non-caption multimodal features",
|
| 792 |
"raw_hits": []
|
| 793 |
},
|
| 794 |
{
|
|
|
|
| 804 |
"raw_hits": []
|
| 805 |
},
|
| 806 |
{
|
| 807 |
+
"name": "object_relevance: public_field_process_short_is_human_readable",
|
| 808 |
"status": "pass",
|
| 809 |
+
"value": "object vocabulary -> multi-hot labels -> sigmoid heads",
|
| 810 |
"raw_hits": []
|
| 811 |
},
|
| 812 |
{
|
|
|
|
| 815 |
"value": "Predict which objects matter in the current window.",
|
| 816 |
"raw_hits": []
|
| 817 |
},
|
| 818 |
{
|
| 819 |
"name": "object_relevance: known_task_family",
|
| 820 |
"status": "pass",
|
|
|
|
| 892 |
"observed": "caption_grounding"
|
| 893 |
},
|
| 894 |
{
|
| 895 |
+
"name": "caption_grounding: public_field_card_blurb_is_human_readable",
|
| 896 |
"status": "pass",
|
| 897 |
+
"value": "Retrieve the matching time window for an annotation-derived text query.",
|
| 898 |
"raw_hits": []
|
| 899 |
},
|
| 900 |
{
|
| 901 |
+
"name": "caption_grounding: public_field_research_name_is_human_readable",
|
| 902 |
"status": "pass",
|
| 903 |
+
"value": "Language-to-Moment Grounding",
|
| 904 |
+
"raw_hits": []
|
| 905 |
+
},
|
| 906 |
+
{
|
| 907 |
+
"name": "caption_grounding: public_field_input_short_is_human_readable",
|
| 908 |
+
"status": "pass",
|
| 909 |
+
"value": "text-like query and candidate windows",
|
| 910 |
"raw_hits": []
|
| 911 |
},
|
| 912 |
{
|
|
|
|
| 922 |
"raw_hits": []
|
| 923 |
},
|
| 924 |
{
|
| 925 |
+
"name": "caption_grounding: public_field_process_short_is_human_readable",
|
| 926 |
"status": "pass",
|
| 927 |
+
"value": "query features -> candidate index -> cosine ranker",
|
| 928 |
"raw_hits": []
|
| 929 |
},
|
| 930 |
{
|
|
|
|
| 933 |
"value": "Given a text-like query from annotation, find the matching time window.",
|
| 934 |
"raw_hits": []
|
| 935 |
},
|
| 936 |
{
|
| 937 |
"name": "caption_grounding: known_task_family",
|
| 938 |
"status": "pass",
|
|
|
|
| 1008 |
"observed": "cross_modal_retrieval"
|
| 1009 |
},
|
| 1010 |
{
|
| 1011 |
+
"name": "cross_modal_retrieval: public_field_card_blurb_is_human_readable",
|
| 1012 |
"status": "pass",
|
| 1013 |
+
"value": "Use motion, IMU, and camera-pose signals to retrieve the matching depth/video window.",
|
| 1014 |
"raw_hits": []
|
| 1015 |
},
|
| 1016 |
{
|
| 1017 |
+
"name": "cross_modal_retrieval: public_field_research_name_is_human_readable",
|
| 1018 |
"status": "pass",
|
| 1019 |
+
"value": "Multimodal Representation Retrieval",
|
| 1020 |
+
"raw_hits": []
|
| 1021 |
+
},
|
| 1022 |
+
{
|
| 1023 |
+
"name": "cross_modal_retrieval: public_field_input_short_is_human_readable",
|
| 1024 |
+
"status": "pass",
|
| 1025 |
+
"value": "motion/IMU/pose query; depth/video candidates",
|
| 1026 |
"raw_hits": []
|
| 1027 |
},
|
| 1028 |
{
|
|
|
|
| 1038 |
"raw_hits": []
|
| 1039 |
},
|
| 1040 |
{
|
| 1041 |
+
"name": "cross_modal_retrieval: public_field_process_short_is_human_readable",
|
| 1042 |
"status": "pass",
|
| 1043 |
+
"value": "modality split -> projection -> nearest-neighbor ranker",
|
| 1044 |
"raw_hits": []
|
| 1045 |
},
|
| 1046 |
{
|
|
|
|
| 1049 |
"value": "Use one group of modalities to retrieve the matching window from another group.",
|
| 1050 |
"raw_hits": []
|
| 1051 |
},
|
| 1052 |
{
|
| 1053 |
"name": "cross_modal_retrieval: known_task_family",
|
| 1054 |
"status": "pass",
|
|
|
|
| 1126 |
"observed": "modality_reconstruction"
|
| 1127 |
},
|
| 1128 |
{
|
| 1129 |
+
"name": "modality_reconstruction: public_field_card_blurb_is_human_readable",
|
| 1130 |
"status": "pass",
|
| 1131 |
+
"value": "Predict compressed depth/video feature vectors from motion, IMU, and camera-pose features.",
|
| 1132 |
"raw_hits": []
|
| 1133 |
},
|
| 1134 |
{
|
| 1135 |
+
"name": "modality_reconstruction: public_field_research_name_is_human_readable",
|
| 1136 |
"status": "pass",
|
| 1137 |
+
"value": "Modality Feature Reconstruction",
|
| 1138 |
+
"raw_hits": []
|
| 1139 |
+
},
|
| 1140 |
+
{
|
| 1141 |
+
"name": "modality_reconstruction: public_field_input_short_is_human_readable",
|
| 1142 |
+
"status": "pass",
|
| 1143 |
+
"value": "motion, IMU, and camera/pose features",
|
| 1144 |
"raw_hits": []
|
| 1145 |
},
|
| 1146 |
{
|
|
|
|
| 1156 |
"raw_hits": []
|
| 1157 |
},
|
| 1158 |
{
|
| 1159 |
+
"name": "modality_reconstruction: public_field_process_short_is_human_readable",
|
| 1160 |
"status": "pass",
|
| 1161 |
+
"value": "source-target split -> scaler -> regression head",
|
| 1162 |
"raw_hits": []
|
| 1163 |
},
|
| 1164 |
{
|
|
|
|
| 1167 |
"value": "Predict one modality feature block from other modality blocks.",
|
| 1168 |
"raw_hits": []
|
| 1169 |
},
|
| 1170 |
{
|
| 1171 |
"name": "modality_reconstruction: known_task_family",
|
| 1172 |
"status": "pass",
|
|
|
|
| 1243 |
"status": "pass",
|
| 1244 |
"observed": "temporal_order"
|
| 1245 |
},
|
| 1246 |
{
|
| 1247 |
"name": "temporal_order: public_field_card_blurb_is_human_readable",
|
| 1248 |
"status": "pass",
|
|
|
|
| 1250 |
"raw_hits": []
|
| 1251 |
},
|
| 1252 |
{
|
| 1253 |
+
"name": "temporal_order: public_field_research_name_is_human_readable",
|
| 1254 |
"status": "pass",
|
| 1255 |
"value": "Temporal Order Verification",
|
| 1256 |
"raw_hits": []
|
| 1257 |
},
|
| 1258 |
{
|
| 1259 |
+
"name": "temporal_order: public_field_input_short_is_human_readable",
|
| 1260 |
"status": "pass",
|
| 1261 |
+
"value": "two adjacent windows plus difference vector",
|
| 1262 |
"raw_hits": []
|
| 1263 |
},
|
| 1264 |
{
|
| 1265 |
+
"name": "temporal_order: public_field_display_name_is_human_readable",
|
| 1266 |
"status": "pass",
|
| 1267 |
"value": "Temporal Order Verification",
|
| 1268 |
"raw_hits": []
|
| 1269 |
},
|
| 1270 |
{
|
| 1271 |
+
"name": "temporal_order: public_field_output_short_is_human_readable",
|
| 1272 |
"status": "pass",
|
| 1273 |
+
"value": "correct or reversed",
|
| 1274 |
"raw_hits": []
|
| 1275 |
},
|
| 1276 |
{
|
|
|
|
| 1279 |
"value": "pair builder -> feature combiner -> binary classifier",
|
| 1280 |
"raw_hits": []
|
| 1281 |
},
|
| 1282 |
+
{
|
| 1283 |
+
"name": "temporal_order: public_field_plain_goal_is_human_readable",
|
| 1284 |
+
"status": "pass",
|
| 1285 |
+
"value": "Tell whether two nearby windows are in the correct time order.",
|
| 1286 |
+
"raw_hits": []
|
| 1287 |
+
},
|
| 1288 |
{
|
| 1289 |
"name": "temporal_order: known_task_family",
|
| 1290 |
"status": "pass",
|
|
|
|
| 1360 |
"observed": "misalignment_detection"
|
| 1361 |
},
|
| 1362 |
{
|
| 1363 |
+
"name": "misalignment_detection: public_field_card_blurb_is_human_readable",
|
| 1364 |
"status": "pass",
|
| 1365 |
+
"value": "Detect whether motion and visual/depth streams have been artificially shifted out of sync.",
|
| 1366 |
"raw_hits": []
|
| 1367 |
},
|
| 1368 |
{
|
| 1369 |
+
"name": "misalignment_detection: public_field_research_name_is_human_readable",
|
| 1370 |
"status": "pass",
|
| 1371 |
+
"value": "Cross-Modal Misalignment Detection",
|
| 1372 |
+
"raw_hits": []
|
| 1373 |
+
},
|
| 1374 |
+
{
|
| 1375 |
+
"name": "misalignment_detection: public_field_input_short_is_human_readable",
|
| 1376 |
+
"status": "pass",
|
| 1377 |
+
"value": "motion-side and visual/depth-side feature groups",
|
| 1378 |
"raw_hits": []
|
| 1379 |
},
|
| 1380 |
{
|
|
|
|
| 1390 |
"raw_hits": []
|
| 1391 |
},
|
| 1392 |
{
|
| 1393 |
+
"name": "misalignment_detection: public_field_process_short_is_human_readable",
|
| 1394 |
"status": "pass",
|
| 1395 |
+
"value": "aligned/shifted pairs -> feature combiner -> binary classifier",
|
| 1396 |
"raw_hits": []
|
| 1397 |
},
|
| 1398 |
{
|
|
|
|
| 1401 |
"value": "Detect when modalities that should match are shifted out of sync.",
|
| 1402 |
"raw_hits": []
|
| 1403 |
},
|
| 1404 |
{
|
| 1405 |
"name": "misalignment_detection: known_task_family",
|
| 1406 |
"status": "pass",
|
data/website_integrity.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"docs_root": "docs",
|
| 5 |
"site_base": "/ropedia-xperience-10m-task-suite/",
|
| 6 |
"summary": {
|
|
@@ -251,7 +251,7 @@
|
|
| 251 |
},
|
| 252 |
{
|
| 253 |
"path": "data/artifact_index.json",
|
| 254 |
-
"bytes":
|
| 255 |
"top_level_type": "dict"
|
| 256 |
},
|
| 257 |
{
|
|
@@ -291,7 +291,7 @@
|
|
| 291 |
},
|
| 292 |
{
|
| 293 |
"path": "data/mirror_parity.json",
|
| 294 |
-
"bytes":
|
| 295 |
"top_level_type": "dict"
|
| 296 |
},
|
| 297 |
{
|
|
@@ -301,7 +301,7 @@
|
|
| 301 |
},
|
| 302 |
{
|
| 303 |
"path": "data/omni_finetune_verified_result.json",
|
| 304 |
-
"bytes":
|
| 305 |
"top_level_type": "dict"
|
| 306 |
},
|
| 307 |
{
|
|
@@ -321,7 +321,7 @@
|
|
| 321 |
},
|
| 322 |
{
|
| 323 |
"path": "data/project_status.json",
|
| 324 |
-
"bytes":
|
| 325 |
"top_level_type": "dict"
|
| 326 |
},
|
| 327 |
{
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:01+00:00",
|
| 4 |
"docs_root": "docs",
|
| 5 |
"site_base": "/ropedia-xperience-10m-task-suite/",
|
| 6 |
"summary": {
|
|
|
|
| 251 |
},
|
| 252 |
{
|
| 253 |
"path": "data/artifact_index.json",
|
| 254 |
+
"bytes": 39486,
|
| 255 |
"top_level_type": "dict"
|
| 256 |
},
|
| 257 |
{
|
|
|
|
| 291 |
},
|
| 292 |
{
|
| 293 |
"path": "data/mirror_parity.json",
|
| 294 |
+
"bytes": 126335,
|
| 295 |
"top_level_type": "dict"
|
| 296 |
},
|
| 297 |
{
|
|
|
|
| 301 |
},
|
| 302 |
{
|
| 303 |
"path": "data/omni_finetune_verified_result.json",
|
| 304 |
+
"bytes": 4142,
|
| 305 |
"top_level_type": "dict"
|
| 306 |
},
|
| 307 |
{
|
|
|
|
| 321 |
},
|
| 322 |
{
|
| 323 |
"path": "data/project_status.json",
|
| 324 |
+
"bytes": 11274,
|
| 325 |
"top_level_type": "dict"
|
| 326 |
},
|
| 327 |
{
|
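The `website_integrity.json` entries above record a `path`, a `bytes` count, and a `top_level_type` for each data file. The following is a minimal sketch of how such an entry could be derived. The helper name `describe_json_file` and the use of Python type names for `top_level_type` are assumptions made for illustration; the repository's actual validator is not shown in this diff and may work differently.

```python
import json
import os


def describe_json_file(path, root):
    """Hypothetical helper: summarize one JSON file relative to a docs root."""
    full_path = os.path.join(root, path)
    with open(full_path, "rb") as fh:
        raw = fh.read()  # read bytes so the size matches what is on disk
    parsed = json.loads(raw.decode("utf-8"))
    return {
        "path": path,
        "bytes": len(raw),
        # Assumption: the type is reported as the Python type name, e.g. "dict".
        "top_level_type": type(parsed).__name__,
    }
```

Calling `describe_json_file("data/project_status.json", "docs")` would, under these assumptions, return a record shaped like the ones in the diff.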
docs/data/artifact_index.json
CHANGED
|
@@ -1,12 +1,12 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Artifact Index",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"status": "pass",
|
| 5 |
-
"artifact_count":
|
| 6 |
"missing": [],
|
| 7 |
"by_kind": {
|
| 8 |
"project_path": 14,
|
| 9 |
-
"scaleup_contract":
|
| 10 |
"project_scope": 1,
|
| 11 |
"source_alignment": 5,
|
| 12 |
"publication_workflow": 3,
|
|
@@ -28,7 +28,7 @@
|
|
| 28 |
"onboarding_doc": 1,
|
| 29 |
"generated_figure": 3,
|
| 30 |
"generated_figure_assets": 1,
|
| 31 |
-
"scaleup_status":
|
| 32 |
"citation": 1,
|
| 33 |
"license": 1
|
| 34 |
},
|
|
@@ -63,8 +63,8 @@
|
|
| 63 |
"surface": "repo_hf",
|
| 64 |
"shows": "Gives a compact current-state table for first-pass readers.",
|
| 65 |
"exists": true,
|
| 66 |
-
"bytes":
|
| 67 |
-
"sha256": "
|
| 68 |
},
|
| 69 |
{
|
| 70 |
"id": "project_status_json",
|
|
@@ -74,8 +74,8 @@
|
|
| 74 |
"surface": "website_hf",
|
| 75 |
"shows": "Machine-readable copy of the current project status for website and HF mirrors.",
|
| 76 |
"exists": true,
|
| 77 |
-
"bytes":
|
| 78 |
-
"sha256": "
|
| 79 |
},
|
| 80 |
{
|
| 81 |
"id": "research_roadmap",
|
|
@@ -187,6 +187,17 @@
|
|
| 187 |
"bytes": 6519,
|
| 188 |
"sha256": "a3773fc681e298325e2be80556d6be6e7e30b90ba22ee24b66633f07ff9c4ea4"
|
| 189 |
},
|
| 190 |
{
|
| 191 |
"id": "additional_development_directions",
|
| 192 |
"title": "Additional development directions",
|
|
@@ -250,8 +261,8 @@
|
|
| 250 |
"surface": "repo_hf",
|
| 251 |
"shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
|
| 252 |
"exists": true,
|
| 253 |
-
"bytes":
|
| 254 |
-
"sha256": "
|
| 255 |
},
|
| 256 |
{
|
| 257 |
"id": "official_dataset_card_alignment",
|
|
@@ -695,8 +706,8 @@
|
|
| 695 |
"surface": "repo_hf",
|
| 696 |
"shows": "Generates the selective artifact catalog from local files.",
|
| 697 |
"exists": true,
|
| 698 |
-
"bytes":
|
| 699 |
-
"sha256": "
|
| 700 |
},
|
| 701 |
{
|
| 702 |
"id": "publication_audit",
|
|
@@ -731,7 +742,7 @@
|
|
| 731 |
"volatile": true,
|
| 732 |
"shows": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, website HTML, and validator files.",
|
| 733 |
"exists": true,
|
| 734 |
-
"bytes":
|
| 735 |
"hash_policy": "existence_and_size_only"
|
| 736 |
},
|
| 737 |
{
|
|
@@ -933,6 +944,28 @@
|
|
| 933 |
"bytes": 3076,
|
| 934 |
"sha256": "23b87581cfc1d95b0af118a0dbb4e601f42fc6bad608759490e13a9a1ef73205"
|
| 935 |
},
|
| 936 |
{
|
| 937 |
"id": "citation",
|
| 938 |
"title": "Citation metadata",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Artifact Index",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:53:45+00:00",
|
| 4 |
"status": "pass",
|
| 5 |
+
"artifact_count": 86,
|
| 6 |
"missing": [],
|
| 7 |
"by_kind": {
|
| 8 |
"project_path": 14,
|
| 9 |
+
"scaleup_contract": 7,
|
| 10 |
"project_scope": 1,
|
| 11 |
"source_alignment": 5,
|
| 12 |
"publication_workflow": 3,
|
|
|
|
| 28 |
"onboarding_doc": 1,
|
| 29 |
"generated_figure": 3,
|
| 30 |
"generated_figure_assets": 1,
|
| 31 |
+
"scaleup_status": 4,
|
| 32 |
"citation": 1,
|
| 33 |
"license": 1
|
| 34 |
},
|
|
|
|
| 63 |
"surface": "repo_hf",
|
| 64 |
"shows": "Gives a compact current-state table for first-pass readers.",
|
| 65 |
"exists": true,
|
| 66 |
+
"bytes": 8805,
|
| 67 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 68 |
},
|
| 69 |
{
|
| 70 |
"id": "project_status_json",
|
|
|
|
| 74 |
"surface": "website_hf",
|
| 75 |
"shows": "Machine-readable copy of the current project status for website and HF mirrors.",
|
| 76 |
"exists": true,
|
| 77 |
+
"bytes": 11274,
|
| 78 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 79 |
},
|
| 80 |
{
|
| 81 |
"id": "research_roadmap",
|
|
|
|
| 187 |
"bytes": 6519,
|
| 188 |
"sha256": "a3773fc681e298325e2be80556d6be6e7e30b90ba22ee24b66633f07ff9c4ea4"
|
| 189 |
},
|
| 190 |
+
{
|
| 191 |
+
"id": "qwen3_omni_error_analysis_script",
|
| 192 |
+
"title": "Qwen3-Omni held-out error-analysis script",
|
| 193 |
+
"path": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 194 |
+
"kind": "scaleup_contract",
|
| 195 |
+
"surface": "repo_hf",
|
| 196 |
+
"shows": "Computes public-safe held-out error-analysis tables by episode, action family, train-seen status, required-modality state, and object category.",
|
| 197 |
+
"exists": true,
|
| 198 |
+
"bytes": 15676,
|
| 199 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 200 |
+
},
|
| 201 |
{
|
| 202 |
"id": "additional_development_directions",
|
| 203 |
"title": "Additional development directions",
|
|
|
|
| 261 |
"surface": "repo_hf",
|
| 262 |
"shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
|
| 263 |
"exists": true,
|
| 264 |
+
"bytes": 16318,
|
| 265 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 266 |
},
|
| 267 |
{
|
| 268 |
"id": "official_dataset_card_alignment",
|
|
|
|
| 706 |
"surface": "repo_hf",
|
| 707 |
"shows": "Generates the selective artifact catalog from local files.",
|
| 708 |
"exists": true,
|
| 709 |
+
"bytes": 32191,
|
| 710 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 711 |
},
|
| 712 |
{
|
| 713 |
"id": "publication_audit",
|
|
|
|
| 742 |
"volatile": true,
|
| 743 |
"shows": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, website HTML, and validator files.",
|
| 744 |
"exists": true,
|
| 745 |
+
"bytes": 126335,
|
| 746 |
"hash_policy": "existence_and_size_only"
|
| 747 |
},
|
| 748 |
{
|
|
|
|
| 944 |
"bytes": 3076,
|
| 945 |
"sha256": "23b87581cfc1d95b0af118a0dbb4e601f42fc6bad608759490e13a9a1ef73205"
|
| 946 |
},
|
| 947 |
+
{
|
| 948 |
+
"id": "qwen3_omni_error_analysis_report",
|
| 949 |
+
"title": "Qwen3-Omni held-out error-analysis report",
|
| 950 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 951 |
+
"kind": "scaleup_status",
|
| 952 |
+
"surface": "repo_hf",
|
| 953 |
+
"shows": "Summarizes validation-aware Qwen3-Omni held-out failures by episode, action family, train-seen status, required-modality state, and object category.",
|
| 954 |
+
"exists": true,
|
| 955 |
+
"bytes": 3331,
|
| 956 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 957 |
+
},
|
| 958 |
+
{
|
| 959 |
+
"id": "qwen3_omni_error_analysis_json",
|
| 960 |
+
"title": "Qwen3-Omni held-out error-analysis JSON",
|
| 961 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 962 |
+
"kind": "scaleup_status",
|
| 963 |
+
"surface": "repo_hf",
|
| 964 |
+
"shows": "Machine-readable Qwen3-Omni held-out error analysis with grouped metrics and sanitized failure examples.",
|
| 965 |
+
"exists": true,
|
| 966 |
+
"bytes": 25202,
|
| 967 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 968 |
+
},
|
| 969 |
{
|
| 970 |
"id": "citation",
|
| 971 |
"title": "Citation metadata",
|
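Each artifact index entry above pairs a file with `exists`, `bytes`, and `sha256` fields. The sketch below shows one way such a fingerprint could be computed, assuming the digest is SHA-256 over the raw file bytes. The function name `fingerprint` and the chunk size are hypothetical; the logic of `scripts/build_artifact_index.py` itself is not visible in this diff.

```python
import hashlib
import os


def fingerprint(path, chunk_size=65536):
    """Hypothetical helper: return exists/bytes/sha256 for one file."""
    if not os.path.isfile(path):
        return {"exists": False}
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Hash in chunks so large artifacts do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return {"exists": True, "bytes": size, "sha256": digest.hexdigest()}
```

Two copies of a file can then be treated as matching when their `fingerprint` results are equal. Entries marked `"hash_policy": "existence_and_size_only"` would presumably compare only `exists` and `bytes`.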
docs/data/mirror_parity.json
CHANGED
|
@@ -1,9 +1,9 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"hf_root": "hf_publish",
|
| 5 |
"summary": {
|
| 6 |
-
"group_count":
|
| 7 |
"failure_count": 0,
|
| 8 |
"failures_by_surface": {}
|
| 9 |
},
|
|
@@ -102,27 +102,27 @@
|
|
| 102 |
"local": {
|
| 103 |
"path": "repo:docs/data/artifact_index.json",
|
| 104 |
"exists": true,
|
| 105 |
-
"bytes":
|
| 106 |
-
"sha256": "
|
| 107 |
},
|
| 108 |
"mirrors": {
|
| 109 |
"hf_space": {
|
| 110 |
"path": "hf_space:data/artifact_index.json",
|
| 111 |
"exists": true,
|
| 112 |
-
"bytes":
|
| 113 |
-
"sha256": "
|
| 114 |
},
|
| 115 |
"hf_artifacts": {
|
| 116 |
"path": "hf_artifacts:docs/data/artifact_index.json",
|
| 117 |
"exists": true,
|
| 118 |
-
"bytes":
|
| 119 |
-
"sha256": "
|
| 120 |
},
|
| 121 |
"hf_model": {
|
| 122 |
"path": "hf_model:metrics/artifact_index.json",
|
| 123 |
"exists": true,
|
| 124 |
-
"bytes":
|
| 125 |
-
"sha256": "
|
| 126 |
}
|
| 127 |
},
|
| 128 |
"failures": []
|
|
@@ -350,27 +350,27 @@
|
|
| 350 |
"local": {
|
| 351 |
"path": "repo:docs/data/omni_finetune_verified_result.json",
|
| 352 |
"exists": true,
|
| 353 |
-
"bytes":
|
| 354 |
-
"sha256": "
|
| 355 |
},
|
| 356 |
"mirrors": {
|
| 357 |
"hf_space": {
|
| 358 |
"path": "hf_space:data/omni_finetune_verified_result.json",
|
| 359 |
"exists": true,
|
| 360 |
-
"bytes":
|
| 361 |
-
"sha256": "
|
| 362 |
},
|
| 363 |
"hf_artifacts": {
|
| 364 |
"path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
|
| 365 |
"exists": true,
|
| 366 |
-
"bytes":
|
| 367 |
-
"sha256": "
|
| 368 |
},
|
| 369 |
"hf_model": {
|
| 370 |
"path": "hf_model:metrics/omni_finetune_verified_result.json",
|
| 371 |
"exists": true,
|
| 372 |
-
"bytes":
|
| 373 |
-
"sha256": "
|
| 374 |
}
|
| 375 |
},
|
| 376 |
"failures": []
|
|
@@ -474,27 +474,27 @@
|
|
| 474 |
"local": {
|
| 475 |
"path": "repo:docs/data/project_status.json",
|
| 476 |
"exists": true,
|
| 477 |
-
"bytes":
|
| 478 |
-
"sha256": "
|
| 479 |
},
|
| 480 |
"mirrors": {
|
| 481 |
"hf_space": {
|
| 482 |
"path": "hf_space:data/project_status.json",
|
| 483 |
"exists": true,
|
| 484 |
-
"bytes":
|
| 485 |
-
"sha256": "
|
| 486 |
},
|
| 487 |
"hf_artifacts": {
|
| 488 |
"path": "hf_artifacts:docs/data/project_status.json",
|
| 489 |
"exists": true,
|
| 490 |
-
"bytes":
|
| 491 |
-
"sha256": "
|
| 492 |
},
|
| 493 |
"hf_model": {
|
| 494 |
"path": "hf_model:metrics/project_status.json",
|
| 495 |
"exists": true,
|
| 496 |
-
"bytes":
|
| 497 |
-
"sha256": "
|
| 498 |
}
|
| 499 |
},
|
| 500 |
"failures": []
|
|
@@ -506,26 +506,26 @@
|
|
| 506 |
"path": "repo:docs/data/publication_audit.json",
|
| 507 |
"exists": true,
|
| 508 |
"bytes": 7237,
|
| 509 |
-
"sha256": "
|
| 510 |
},
|
| 511 |
"mirrors": {
|
| 512 |
"hf_space": {
|
| 513 |
"path": "hf_space:data/publication_audit.json",
|
| 514 |
"exists": true,
|
| 515 |
"bytes": 7237,
|
| 516 |
-
"sha256": "
|
| 517 |
},
|
| 518 |
"hf_artifacts": {
|
| 519 |
"path": "hf_artifacts:docs/data/publication_audit.json",
|
| 520 |
"exists": true,
|
| 521 |
"bytes": 7237,
|
| 522 |
-
"sha256": "
|
| 523 |
},
|
| 524 |
"hf_model": {
|
| 525 |
"path": "hf_model:metrics/publication_audit.json",
|
| 526 |
"exists": true,
|
| 527 |
"bytes": 7237,
|
| 528 |
-
"sha256": "
|
| 529 |
}
|
| 530 |
},
|
| 531 |
"failures": []
|
|
@@ -816,26 +816,26 @@
|
|
| 816 |
"path": "repo:docs/data/scope_claims_audit.json",
|
| 817 |
"exists": true,
|
| 818 |
"bytes": 20823,
|
| 819 |
-
"sha256": "
|
| 820 |
},
|
| 821 |
"mirrors": {
|
| 822 |
"hf_space": {
|
| 823 |
"path": "hf_space:data/scope_claims_audit.json",
|
| 824 |
"exists": true,
|
| 825 |
"bytes": 20823,
|
| 826 |
-
"sha256": "
|
| 827 |
},
|
| 828 |
"hf_artifacts": {
|
| 829 |
"path": "hf_artifacts:docs/data/scope_claims_audit.json",
|
| 830 |
"exists": true,
|
| 831 |
"bytes": 20823,
|
| 832 |
-
"sha256": "
|
| 833 |
},
|
| 834 |
"hf_model": {
|
| 835 |
"path": "hf_model:metrics/scope_claims_audit.json",
|
| 836 |
"exists": true,
|
| 837 |
"bytes": 20823,
|
| 838 |
-
"sha256": "
|
| 839 |
}
|
| 840 |
},
|
| 841 |
"failures": []
|
|
@@ -940,26 +940,26 @@
|
|
| 940 |
"path": "repo:docs/data/task_surface_integrity.json",
|
| 941 |
"exists": true,
|
| 942 |
"bytes": 45779,
|
| 943 |
-
"sha256": "
|
| 944 |
},
|
| 945 |
"mirrors": {
|
| 946 |
"hf_space": {
|
| 947 |
"path": "hf_space:data/task_surface_integrity.json",
|
| 948 |
"exists": true,
|
| 949 |
"bytes": 45779,
|
| 950 |
-
"sha256": "
|
| 951 |
},
|
| 952 |
"hf_artifacts": {
|
| 953 |
"path": "hf_artifacts:docs/data/task_surface_integrity.json",
|
| 954 |
"exists": true,
|
| 955 |
"bytes": 45779,
|
| 956 |
-
"sha256": "
|
| 957 |
},
|
| 958 |
"hf_model": {
|
| 959 |
"path": "hf_model:metrics/task_surface_integrity.json",
|
| 960 |
"exists": true,
|
| 961 |
"bytes": 45779,
|
| 962 |
-
"sha256": "
|
| 963 |
}
|
| 964 |
},
|
| 965 |
"failures": []
|
|
@@ -1002,26 +1002,26 @@
|
|
| 1002 |
"path": "repo:docs/data/website_integrity.json",
|
| 1003 |
"exists": true,
|
| 1004 |
"bytes": 15221,
|
| 1005 |
-
"sha256": "
|
| 1006 |
},
|
| 1007 |
"mirrors": {
|
| 1008 |
"hf_space": {
|
| 1009 |
"path": "hf_space:data/website_integrity.json",
|
| 1010 |
"exists": true,
|
| 1011 |
"bytes": 15221,
|
| 1012 |
-
"sha256": "
|
| 1013 |
},
|
| 1014 |
"hf_artifacts": {
|
| 1015 |
"path": "hf_artifacts:docs/data/website_integrity.json",
|
| 1016 |
"exists": true,
|
| 1017 |
"bytes": 15221,
|
| 1018 |
-
"sha256": "
|
| 1019 |
},
|
| 1020 |
"hf_model": {
|
| 1021 |
"path": "hf_model:metrics/website_integrity.json",
|
| 1022 |
"exists": true,
|
| 1023 |
"bytes": 15221,
|
| 1024 |
-
"sha256": "
|
| 1025 |
}
|
| 1026 |
},
|
| 1027 |
"failures": []
|
|
@@ -1723,6 +1723,31 @@
|
|
| 1723 |
},
|
| 1724 |
"failures": []
|
| 1725 |
},
|
| 1726 |
{
|
| 1727 |
"name": "scripts/audio_ablation_and_raw_upgrade.py",
|
| 1728 |
"status": "pass",
|
|
@@ -1754,21 +1779,21 @@
|
|
| 1754 |
"local": {
|
| 1755 |
"path": "repo:scripts/build_artifact_index.py",
|
| 1756 |
"exists": true,
|
| 1757 |
-
"bytes":
|
| 1758 |
-
"sha256": "
|
| 1759 |
},
|
| 1760 |
"mirrors": {
|
| 1761 |
"hf_artifacts": {
|
| 1762 |
"path": "hf_artifacts:scripts/build_artifact_index.py",
|
| 1763 |
"exists": true,
|
| 1764 |
-
"bytes":
|
| 1765 |
-
"sha256": "
|
| 1766 |
},
|
| 1767 |
"hf_model": {
|
| 1768 |
"path": "hf_model:scripts/build_artifact_index.py",
|
| 1769 |
"exists": true,
|
| 1770 |
-
"bytes":
|
| 1771 |
-
"sha256": "
|
| 1772 |
}
|
| 1773 |
},
|
| 1774 |
"failures": []
|
|
@@ -2054,21 +2079,21 @@
|
|
| 2054 |
"local": {
|
| 2055 |
"path": "repo:scripts/validate_mirror_parity.py",
|
| 2056 |
"exists": true,
|
| 2057 |
-
"bytes":
|
| 2058 |
-
"sha256": "
|
| 2059 |
},
|
| 2060 |
"mirrors": {
|
| 2061 |
"hf_artifacts": {
|
| 2062 |
"path": "hf_artifacts:scripts/validate_mirror_parity.py",
|
| 2063 |
"exists": true,
|
| 2064 |
-
"bytes":
|
| 2065 |
-
"sha256": "
|
| 2066 |
},
|
| 2067 |
"hf_model": {
|
| 2068 |
"path": "hf_model:scripts/validate_mirror_parity.py",
|
| 2069 |
"exists": true,
|
| 2070 |
-
"bytes":
|
| 2071 |
-
"sha256": "
|
| 2072 |
}
|
| 2073 |
},
|
| 2074 |
"failures": []
|
|
@@ -2807,6 +2832,285 @@
|
|
| 2807 |
},
|
| 2808 |
"failures": []
|
| 2809 |
},
|
| 2810 |
{
|
| 2811 |
"name": "docs/QUALITY_GATES.md",
|
| 2812 |
"status": "pass",
|
|
@@ -3061,27 +3365,27 @@
|
|
| 3061 |
"local": {
|
| 3062 |
"path": "repo:PROJECT_STATUS.md",
|
| 3063 |
"exists": true,
|
| 3064 |
-
"bytes":
|
| 3065 |
-
"sha256": "
|
| 3066 |
},
|
| 3067 |
"mirrors": {
|
| 3068 |
"hf_space": {
|
| 3069 |
"path": "hf_space:PROJECT_STATUS.md",
|
| 3070 |
"exists": true,
|
| 3071 |
-
"bytes":
|
| 3072 |
-
"sha256": "
|
| 3073 |
},
|
| 3074 |
"hf_artifacts": {
|
| 3075 |
"path": "hf_artifacts:PROJECT_STATUS.md",
|
| 3076 |
"exists": true,
|
| 3077 |
-
"bytes":
|
| 3078 |
-
"sha256": "
|
| 3079 |
},
|
| 3080 |
"hf_model": {
|
| 3081 |
"path": "hf_model:PROJECT_STATUS.md",
|
| 3082 |
"exists": true,
|
| 3083 |
-
"bytes":
|
| 3084 |
-
"sha256": "
|
| 3085 |
}
|
| 3086 |
},
|
| 3087 |
"failures": []
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:56:44+00:00",
|
| 4 |
"hf_root": "hf_publish",
|
| 5 |
"summary": {
|
| 6 |
+
"group_count": 114,
|
| 7 |
"failure_count": 0,
|
| 8 |
"failures_by_surface": {}
|
| 9 |
},
|
|
|
|
| 102 |
"local": {
|
| 103 |
"path": "repo:docs/data/artifact_index.json",
|
| 104 |
"exists": true,
|
| 105 |
+
"bytes": 39486,
|
| 106 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 107 |
},
|
| 108 |
"mirrors": {
|
| 109 |
"hf_space": {
|
| 110 |
"path": "hf_space:data/artifact_index.json",
|
| 111 |
"exists": true,
|
| 112 |
+
"bytes": 39486,
|
| 113 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 114 |
},
|
| 115 |
"hf_artifacts": {
|
| 116 |
"path": "hf_artifacts:docs/data/artifact_index.json",
|
| 117 |
"exists": true,
|
| 118 |
+
"bytes": 39486,
|
| 119 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 120 |
},
|
| 121 |
"hf_model": {
|
| 122 |
"path": "hf_model:metrics/artifact_index.json",
|
| 123 |
"exists": true,
|
| 124 |
+
"bytes": 39486,
|
| 125 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 126 |
}
|
| 127 |
},
|
| 128 |
"failures": []
|
|
|
|
| 350 |
"local": {
|
| 351 |
"path": "repo:docs/data/omni_finetune_verified_result.json",
|
| 352 |
"exists": true,
|
| 353 |
+
"bytes": 4142,
|
| 354 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 355 |
},
|
| 356 |
"mirrors": {
|
| 357 |
"hf_space": {
|
| 358 |
"path": "hf_space:data/omni_finetune_verified_result.json",
|
| 359 |
"exists": true,
|
| 360 |
+
"bytes": 4142,
|
| 361 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 362 |
},
|
| 363 |
"hf_artifacts": {
|
| 364 |
"path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
|
| 365 |
"exists": true,
|
| 366 |
+
"bytes": 4142,
|
| 367 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 368 |
},
|
| 369 |
"hf_model": {
|
| 370 |
"path": "hf_model:metrics/omni_finetune_verified_result.json",
|
| 371 |
"exists": true,
|
| 372 |
+
"bytes": 4142,
|
| 373 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 374 |
}
|
| 375 |
},
|
| 376 |
"failures": []
|
|
|
|
| 474 |
"local": {
|
| 475 |
"path": "repo:docs/data/project_status.json",
|
| 476 |
"exists": true,
|
| 477 |
+
"bytes": 11274,
|
| 478 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 479 |
},
|
| 480 |
"mirrors": {
|
| 481 |
"hf_space": {
|
| 482 |
"path": "hf_space:data/project_status.json",
|
| 483 |
"exists": true,
|
| 484 |
+
"bytes": 11274,
|
| 485 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 486 |
},
|
| 487 |
"hf_artifacts": {
|
| 488 |
"path": "hf_artifacts:docs/data/project_status.json",
|
| 489 |
"exists": true,
|
| 490 |
+
"bytes": 11274,
|
| 491 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 492 |
},
|
| 493 |
"hf_model": {
|
| 494 |
"path": "hf_model:metrics/project_status.json",
|
| 495 |
"exists": true,
|
| 496 |
+
"bytes": 11274,
|
| 497 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 498 |
}
|
| 499 |
},
|
| 500 |
"failures": []
|
|
|
|
| 506 |
"path": "repo:docs/data/publication_audit.json",
|
| 507 |
"exists": true,
|
| 508 |
"bytes": 7237,
|
| 509 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 510 |
},
|
| 511 |
"mirrors": {
|
| 512 |
"hf_space": {
|
| 513 |
"path": "hf_space:data/publication_audit.json",
|
| 514 |
"exists": true,
|
| 515 |
"bytes": 7237,
|
| 516 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 517 |
},
|
| 518 |
"hf_artifacts": {
|
| 519 |
"path": "hf_artifacts:docs/data/publication_audit.json",
|
| 520 |
"exists": true,
|
| 521 |
"bytes": 7237,
|
| 522 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 523 |
},
|
| 524 |
"hf_model": {
|
| 525 |
"path": "hf_model:metrics/publication_audit.json",
|
| 526 |
"exists": true,
|
| 527 |
"bytes": 7237,
|
| 528 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 529 |
}
|
| 530 |
},
|
| 531 |
"failures": []
|
|
|
|
| 816 |
"path": "repo:docs/data/scope_claims_audit.json",
|
| 817 |
"exists": true,
|
| 818 |
"bytes": 20823,
|
| 819 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 820 |
},
|
| 821 |
"mirrors": {
|
| 822 |
"hf_space": {
|
| 823 |
"path": "hf_space:data/scope_claims_audit.json",
|
| 824 |
"exists": true,
|
| 825 |
"bytes": 20823,
|
| 826 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 827 |
},
|
| 828 |
"hf_artifacts": {
|
| 829 |
"path": "hf_artifacts:docs/data/scope_claims_audit.json",
|
| 830 |
"exists": true,
|
| 831 |
"bytes": 20823,
|
| 832 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 833 |
},
|
| 834 |
"hf_model": {
|
| 835 |
"path": "hf_model:metrics/scope_claims_audit.json",
|
| 836 |
"exists": true,
|
| 837 |
"bytes": 20823,
|
| 838 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 839 |
}
|
| 840 |
},
|
| 841 |
"failures": []
|
|
|
|
| 940 |
"path": "repo:docs/data/task_surface_integrity.json",
|
| 941 |
"exists": true,
|
| 942 |
"bytes": 45779,
|
| 943 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 944 |
},
|
| 945 |
"mirrors": {
|
| 946 |
"hf_space": {
|
| 947 |
"path": "hf_space:data/task_surface_integrity.json",
|
| 948 |
"exists": true,
|
| 949 |
"bytes": 45779,
|
| 950 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 951 |
},
|
| 952 |
"hf_artifacts": {
|
| 953 |
"path": "hf_artifacts:docs/data/task_surface_integrity.json",
|
| 954 |
"exists": true,
|
| 955 |
"bytes": 45779,
|
| 956 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 957 |
},
|
| 958 |
"hf_model": {
|
| 959 |
"path": "hf_model:metrics/task_surface_integrity.json",
|
| 960 |
"exists": true,
|
| 961 |
"bytes": 45779,
|
| 962 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 963 |
}
|
| 964 |
},
|
| 965 |
"failures": []
|
|
|
|
| 1002 |
"path": "repo:docs/data/website_integrity.json",
|
| 1003 |
"exists": true,
|
| 1004 |
"bytes": 15221,
|
| 1005 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1006 |
},
|
| 1007 |
"mirrors": {
|
| 1008 |
"hf_space": {
|
| 1009 |
"path": "hf_space:data/website_integrity.json",
|
| 1010 |
"exists": true,
|
| 1011 |
"bytes": 15221,
|
| 1012 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1013 |
},
|
| 1014 |
"hf_artifacts": {
|
| 1015 |
"path": "hf_artifacts:docs/data/website_integrity.json",
|
| 1016 |
"exists": true,
|
| 1017 |
"bytes": 15221,
|
| 1018 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1019 |
},
|
| 1020 |
"hf_model": {
|
| 1021 |
"path": "hf_model:metrics/website_integrity.json",
|
| 1022 |
"exists": true,
|
| 1023 |
"bytes": 15221,
|
| 1024 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1025 |
}
|
| 1026 |
},
|
| 1027 |
"failures": []
|
|
|
|
| 1723 |
},
|
| 1724 |
"failures": []
|
| 1725 |
},
|
| 1726 |
+
{
|
| 1727 |
+
"name": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1728 |
+
"status": "pass",
|
| 1729 |
+
"local": {
|
| 1730 |
+
"path": "repo:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1731 |
+
"exists": true,
|
| 1732 |
+
"bytes": 15676,
|
| 1733 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 1734 |
+
},
|
| 1735 |
+
"mirrors": {
|
| 1736 |
+
"hf_artifacts": {
|
| 1737 |
+
"path": "hf_artifacts:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1738 |
+
"exists": true,
|
| 1739 |
+
"bytes": 15676,
|
| 1740 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 1741 |
+
},
|
| 1742 |
+
"hf_model": {
|
| 1743 |
+
"path": "hf_model:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1744 |
+
"exists": true,
|
| 1745 |
+
"bytes": 15676,
|
| 1746 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 1747 |
+
}
|
| 1748 |
+
},
|
| 1749 |
+
"failures": []
|
| 1750 |
+
},
|
| 1751 |
{
|
| 1752 |
"name": "scripts/audio_ablation_and_raw_upgrade.py",
|
| 1753 |
"status": "pass",
|
|
|
|
| 1779 |
"local": {
|
| 1780 |
"path": "repo:scripts/build_artifact_index.py",
|
| 1781 |
"exists": true,
|
| 1782 |
+
"bytes": 32191,
|
| 1783 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1784 |
},
|
| 1785 |
"mirrors": {
|
| 1786 |
"hf_artifacts": {
|
| 1787 |
"path": "hf_artifacts:scripts/build_artifact_index.py",
|
| 1788 |
"exists": true,
|
| 1789 |
+
"bytes": 32191,
|
| 1790 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1791 |
},
|
| 1792 |
"hf_model": {
|
| 1793 |
"path": "hf_model:scripts/build_artifact_index.py",
|
| 1794 |
"exists": true,
|
| 1795 |
+
"bytes": 32191,
|
| 1796 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1797 |
}
|
| 1798 |
},
|
| 1799 |
"failures": []
|
|
|
|
| 2079 |
"local": {
|
| 2080 |
"path": "repo:scripts/validate_mirror_parity.py",
|
| 2081 |
"exists": true,
|
| 2082 |
+
"bytes": 13781,
|
| 2083 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2084 |
},
|
| 2085 |
"mirrors": {
|
| 2086 |
"hf_artifacts": {
|
| 2087 |
"path": "hf_artifacts:scripts/validate_mirror_parity.py",
|
| 2088 |
"exists": true,
|
| 2089 |
+
"bytes": 13781,
|
| 2090 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2091 |
},
|
| 2092 |
"hf_model": {
|
| 2093 |
"path": "hf_model:scripts/validate_mirror_parity.py",
|
| 2094 |
"exists": true,
|
| 2095 |
+
"bytes": 13781,
|
| 2096 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2097 |
}
|
| 2098 |
},
|
| 2099 |
"failures": []
|
|
|
|
| 2832 |
},
|
| 2833 |
"failures": []
|
| 2834 |
},
|
| 2835 |
+
{
|
| 2836 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2837 |
+
"status": "pass",
|
| 2838 |
+
"local": {
|
| 2839 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2840 |
+
"exists": true,
|
| 2841 |
+
"bytes": 3331,
|
| 2842 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2843 |
+
},
|
| 2844 |
+
"mirrors": {
|
| 2845 |
+
"hf_space": {
|
| 2846 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2847 |
+
"exists": true,
|
| 2848 |
+
"bytes": 3331,
|
| 2849 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2850 |
+
},
|
| 2851 |
+
"hf_artifacts": {
|
| 2852 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2853 |
+
"exists": true,
|
| 2854 |
+
"bytes": 3331,
|
| 2855 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2856 |
+
},
|
| 2857 |
+
"hf_model": {
|
| 2858 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2859 |
+
"exists": true,
|
| 2860 |
+
"bytes": 3331,
|
| 2861 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2862 |
+
}
|
| 2863 |
+
},
|
| 2864 |
+
"failures": []
|
| 2865 |
+
},
|
| 2866 |
+
{
|
| 2867 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2868 |
+
"status": "pass",
|
| 2869 |
+
"local": {
|
| 2870 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2871 |
+
"exists": true,
|
| 2872 |
+
"bytes": 25202,
|
| 2873 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 2874 |
+
},
|
| 2875 |
+
"mirrors": {
|
| 2876 |
+
"hf_space": {
|
| 2877 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2878 |
+
"exists": true,
|
| 2879 |
+
"bytes": 25202,
|
| 2880 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 2881 |
+
},
|
| 2882 |
+
"hf_artifacts": {
|
| 2883 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2884 |
+
"exists": true,
|
| 2885 |
+
"bytes": 25202,
|
| 2886 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 2887 |
+
},
|
| 2888 |
+
"hf_model": {
|
| 2889 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2890 |
+
"exists": true,
|
| 2891 |
+
"bytes": 25202,
|
| 2892 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 2893 |
+
}
|
| 2894 |
+
},
|
| 2895 |
+
"failures": []
|
| 2896 |
+
},
|
| 2897 |
+
{
|
| 2898 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2899 |
+
"status": "pass",
|
| 2900 |
+
"local": {
|
| 2901 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2902 |
+
"exists": true,
|
| 2903 |
+
"bytes": 2121,
|
| 2904 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 2905 |
+
},
|
| 2906 |
+
"mirrors": {
|
| 2907 |
+
"hf_space": {
|
| 2908 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2909 |
+
"exists": true,
|
| 2910 |
+
"bytes": 2121,
|
| 2911 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 2912 |
+
},
|
| 2913 |
+
"hf_artifacts": {
|
| 2914 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2915 |
+
"exists": true,
|
| 2916 |
+
"bytes": 2121,
|
| 2917 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 2918 |
+
},
|
| 2919 |
+
"hf_model": {
|
| 2920 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2921 |
+
"exists": true,
|
| 2922 |
+
"bytes": 2121,
|
| 2923 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 2924 |
+
}
|
| 2925 |
+
},
|
| 2926 |
+
"failures": []
|
| 2927 |
+
},
|
| 2928 |
+
{
|
| 2929 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2930 |
+
"status": "pass",
|
| 2931 |
+
"local": {
|
| 2932 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2933 |
+
"exists": true,
|
| 2934 |
+
"bytes": 1320,
|
| 2935 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 2936 |
+
},
|
| 2937 |
+
"mirrors": {
|
| 2938 |
+
"hf_space": {
|
| 2939 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2940 |
+
"exists": true,
|
| 2941 |
+
"bytes": 1320,
|
| 2942 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 2943 |
+
},
|
| 2944 |
+
"hf_artifacts": {
|
| 2945 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2946 |
+
"exists": true,
|
| 2947 |
+
"bytes": 1320,
|
| 2948 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 2949 |
+
},
|
| 2950 |
+
"hf_model": {
|
| 2951 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2952 |
+
"exists": true,
|
| 2953 |
+
"bytes": 1320,
|
| 2954 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 2955 |
+
}
|
| 2956 |
+
},
|
| 2957 |
+
"failures": []
|
| 2958 |
+
},
|
| 2959 |
+
{
|
| 2960 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2961 |
+
"status": "pass",
|
| 2962 |
+
"local": {
|
| 2963 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2964 |
+
"exists": true,
|
| 2965 |
+
"bytes": 572,
|
| 2966 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 2967 |
+
},
|
| 2968 |
+
"mirrors": {
|
| 2969 |
+
"hf_space": {
|
| 2970 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2971 |
+
"exists": true,
|
| 2972 |
+
"bytes": 572,
|
| 2973 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 2974 |
+
},
|
| 2975 |
+
"hf_artifacts": {
|
| 2976 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2977 |
+
"exists": true,
|
| 2978 |
+
"bytes": 572,
|
| 2979 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 2980 |
+
},
|
| 2981 |
+
"hf_model": {
|
| 2982 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2983 |
+
"exists": true,
|
| 2984 |
+
"bytes": 572,
|
| 2985 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 2986 |
+
}
|
| 2987 |
+
},
|
| 2988 |
+
"failures": []
|
| 2989 |
+
},
|
| 2990 |
+
{
|
| 2991 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 2992 |
+
"status": "pass",
|
| 2993 |
+
"local": {
|
| 2994 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 2995 |
+
"exists": true,
|
| 2996 |
+
"bytes": 408,
|
| 2997 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 2998 |
+
},
|
| 2999 |
+
"mirrors": {
|
| 3000 |
+
"hf_space": {
|
| 3001 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3002 |
+
"exists": true,
|
| 3003 |
+
"bytes": 408,
|
| 3004 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 3005 |
+
},
|
| 3006 |
+
"hf_artifacts": {
|
| 3007 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3008 |
+
"exists": true,
|
| 3009 |
+
"bytes": 408,
|
| 3010 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 3011 |
+
},
|
| 3012 |
+
"hf_model": {
|
| 3013 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3014 |
+
"exists": true,
|
| 3015 |
+
"bytes": 408,
|
| 3016 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 3017 |
+
}
|
| 3018 |
+
},
|
| 3019 |
+
"failures": []
|
| 3020 |
+
},
|
| 3021 |
+
{
|
| 3022 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3023 |
+
"status": "pass",
|
| 3024 |
+
"local": {
|
| 3025 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3026 |
+
"exists": true,
|
| 3027 |
+
"bytes": 1704,
|
| 3028 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3029 |
+
},
|
| 3030 |
+
"mirrors": {
|
| 3031 |
+
"hf_space": {
|
| 3032 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3033 |
+
"exists": true,
|
| 3034 |
+
"bytes": 1704,
|
| 3035 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3036 |
+
},
|
| 3037 |
+
"hf_artifacts": {
|
| 3038 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3039 |
+
"exists": true,
|
| 3040 |
+
"bytes": 1704,
|
| 3041 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3042 |
+
},
|
| 3043 |
+
"hf_model": {
|
| 3044 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3045 |
+
"exists": true,
|
| 3046 |
+
"bytes": 1704,
|
| 3047 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3048 |
+
}
|
| 3049 |
+
},
|
| 3050 |
+
"failures": []
|
| 3051 |
+
},
|
| 3052 |
+
{
|
| 3053 |
+
"name": "docs/ARTIFACT_GUIDE.md",
|
| 3054 |
+
"status": "pass",
|
| 3055 |
+
"local": {
|
| 3056 |
+
"path": "repo:ARTIFACT_GUIDE.md",
|
| 3057 |
+
"exists": true,
|
| 3058 |
+
"bytes": 16318,
|
| 3059 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3060 |
+
},
|
| 3061 |
+
"mirrors": {
|
| 3062 |
+
"hf_space": {
|
| 3063 |
+
"path": "hf_space:ARTIFACT_GUIDE.md",
|
| 3064 |
+
"exists": true,
|
| 3065 |
+
"bytes": 16318,
|
| 3066 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3067 |
+
},
|
| 3068 |
+
"hf_artifacts": {
|
| 3069 |
+
"path": "hf_artifacts:ARTIFACT_GUIDE.md",
|
| 3070 |
+
"exists": true,
|
| 3071 |
+
"bytes": 16318,
|
| 3072 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3073 |
+
},
|
| 3074 |
+
"hf_model": {
|
| 3075 |
+
"path": "hf_model:ARTIFACT_GUIDE.md",
|
| 3076 |
+
"exists": true,
|
| 3077 |
+
"bytes": 16318,
|
| 3078 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3079 |
+
}
|
| 3080 |
+
},
|
| 3081 |
+
"failures": []
|
| 3082 |
+
},
|
| 3083 |
+
{
|
| 3084 |
+
"name": "docs/OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3085 |
+
"status": "pass",
|
| 3086 |
+
"local": {
|
| 3087 |
+
"path": "repo:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3088 |
+
"exists": true,
|
| 3089 |
+
"bytes": 8900,
|
| 3090 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3091 |
+
},
|
| 3092 |
+
"mirrors": {
|
| 3093 |
+
"hf_space": {
|
| 3094 |
+
"path": "hf_space:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3095 |
+
"exists": true,
|
| 3096 |
+
"bytes": 8900,
|
| 3097 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3098 |
+
},
|
| 3099 |
+
"hf_artifacts": {
|
| 3100 |
+
"path": "hf_artifacts:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3101 |
+
"exists": true,
|
| 3102 |
+
"bytes": 8900,
|
| 3103 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3104 |
+
},
|
| 3105 |
+
"hf_model": {
|
| 3106 |
+
"path": "hf_model:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3107 |
+
"exists": true,
|
| 3108 |
+
"bytes": 8900,
|
| 3109 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3110 |
+
}
|
| 3111 |
+
},
|
| 3112 |
+
"failures": []
|
| 3113 |
+
},
|
| 3114 |
{
|
| 3115 |
"name": "docs/QUALITY_GATES.md",
|
| 3116 |
"status": "pass",
|
|
|
|
| 3365 |
"local": {
|
| 3366 |
"path": "repo:PROJECT_STATUS.md",
|
| 3367 |
"exists": true,
|
| 3368 |
+
"bytes": 8805,
|
| 3369 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3370 |
},
|
| 3371 |
"mirrors": {
|
| 3372 |
"hf_space": {
|
| 3373 |
"path": "hf_space:PROJECT_STATUS.md",
|
| 3374 |
"exists": true,
|
| 3375 |
+
"bytes": 8805,
|
| 3376 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3377 |
},
|
| 3378 |
"hf_artifacts": {
|
| 3379 |
"path": "hf_artifacts:PROJECT_STATUS.md",
|
| 3380 |
"exists": true,
|
| 3381 |
+
"bytes": 8805,
|
| 3382 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3383 |
},
|
| 3384 |
"hf_model": {
|
| 3385 |
"path": "hf_model:PROJECT_STATUS.md",
|
| 3386 |
"exists": true,
|
| 3387 |
+
"bytes": 8805,
|
| 3388 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3389 |
}
|
| 3390 |
},
|
| 3391 |
"failures": []
|
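The parity entries above pair each local file with its mirror copies through the `exists`, `bytes`, and `sha256` fields, and report `"failures": []` when they agree. The following is a minimal Python sketch of how one such entry could be produced. It is not the contents of `scripts/validate_mirror_parity.py`; the function names `file_record` and `parity_entry`, the failure message wording, and the use of plain filesystem paths for the mirrors are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_record(path):
    """Return the path/exists/bytes/sha256 fields for one file (hypothetical helper)."""
    p = Path(path)
    if not p.is_file():
        return {"path": str(path), "exists": False, "bytes": None, "sha256": None}
    data = p.read_bytes()
    return {
        "path": str(path),
        "exists": True,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def parity_entry(name, local_path, mirror_paths):
    """Compare one local file against its mirror copies (hypothetical helper).

    mirror_paths maps a mirror label (for example "hf_space") to a file path.
    """
    local = file_record(local_path)
    mirrors = {label: file_record(path) for label, path in mirror_paths.items()}
    failures = []
    if not local["exists"]:
        failures.append(f"local file missing: {local_path}")
    for label, record in mirrors.items():
        if not record["exists"]:
            failures.append(f"{label}: file missing")
        elif (record["bytes"], record["sha256"]) != (local["bytes"], local["sha256"]):
            failures.append(f"{label}: bytes or sha256 differ from local")
    return {
        "name": name,
        "status": "pass" if not failures else "fail",
        "local": local,
        "mirrors": mirrors,
        "failures": failures,
    }
```

Comparing both the byte count and the digest is redundant in principle, since equal digests imply equal content, but keeping both fields makes a mismatch easier to read in the report.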
docs/data/omni_finetune_verified_result.json
CHANGED
|
@@ -67,7 +67,28 @@
|
|
| 67 |
"audit_status": "pass",
|
| 68 |
"contains_raw_xperience10m_data": false,
|
| 69 |
"contains_qwen_base_weights": false,
|
| 70 |
-
"contains_lora_weights": false
|
| 71 |
},
|
| 72 |
"required_next_steps": [
|
| 73 |
"Improve JSON-format reliability through prompt, decoding, constrained parsing, or target formatting changes.",
|
|
|
|
| 67 |
"audit_status": "pass",
|
| 68 |
"contains_raw_xperience10m_data": false,
|
| 69 |
"contains_qwen_base_weights": false,
|
| 70 |
+
"contains_lora_weights": false,
|
| 71 |
+
"error_analysis": {
|
| 72 |
+
"status": "pass",
|
| 73 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 74 |
+
"markdown_report": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 75 |
+
"groupings": [
|
| 76 |
+
"episode",
|
| 77 |
+
"action_family",
|
| 78 |
+
"train_seen_status",
|
| 79 |
+
"required_modality_state",
|
| 80 |
+
"object_category"
|
| 81 |
+
],
|
| 82 |
+
"key_readouts": {
|
| 83 |
+
"parsed_prediction_rate": 0.8772321428571429,
|
| 84 |
+
"weakest_action_family": "locomotion",
|
| 85 |
+
"weakest_action_family_samples": 23,
|
| 86 |
+
"weakest_action_family_parsed_prediction_rate": 0.2608695652173913,
|
| 87 |
+
"seen_action_exact_rate": 0.04580152671755725,
|
| 88 |
+
"unseen_action_exact_rate": 0.015772870662460567,
|
| 89 |
+
"required_modality_state": "rrd_missing_only_required_modalities_present"
|
| 90 |
+
}
|
| 91 |
+
}
|
| 92 |
},
|
| 93 |
"required_next_steps": [
|
| 94 |
"Improve JSON-format reliability through prompt, decoding, constrained parsing, or target formatting changes.",
|
docs/data/project_status.json
CHANGED
|
@@ -180,10 +180,12 @@
|
|
| 180 |
"evidence": [
|
| 181 |
"docs/data/omni_finetune_verified_result.json",
|
| 182 |
"results/omni_finetune/verified_public/",
|
|
|
|
| 183 |
"scripts/omni/package_verified_omni_result.py",
|
| 184 |
-
"scripts/omni/audit_verified_omni_package.py"
|
|
|
|
| 185 |
],
|
| 186 |
-
"readout": "The selected 96/16/16 episode split produced a validation-aware public-safe held-out package with 3,808 exported windows, 512 validation windows,
|
| 187 |
},
|
| 188 |
{
|
| 189 |
"area": "Raw Xperience-10M redistribution",
|
|
|
|
| 180 |
"evidence": [
|
| 181 |
"docs/data/omni_finetune_verified_result.json",
|
| 182 |
"results/omni_finetune/verified_public/",
|
| 183 |
+
"results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/",
|
| 184 |
"scripts/omni/package_verified_omni_result.py",
|
| 185 |
+
"scripts/omni/audit_verified_omni_package.py",
|
| 186 |
+
"scripts/omni/analyze_qwen3_omni_errors.py"
|
| 187 |
],
|
| 188 |
+
"readout": "The selected 96/16/16 episode split produced a validation-aware public-safe held-out package with 3,808 exported windows, 512 validation windows, 448 test predictions, and derived error-analysis tables by episode, action family, train-seen status, required-modality state, and object category. JSON validity is 87.50%, below the 98% target, so it is a diagnostic baseline but not a strong model-quality result."
|
| 189 |
},
|
| 190 |
{
|
| 191 |
"area": "Raw Xperience-10M redistribution",
|
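The updated readout above reports 448 test predictions and a JSON validity of 87.50% against a 98% target. The arithmetic behind that comparison can be sketched as follows. The helper names `implied_count` and `check_target` are hypothetical, and the implied count is derived here from the published rate rather than read from the package.

```python
def implied_count(rate, total):
    """Number of items a rate corresponds to, rounded to the nearest integer."""
    return round(rate * total)


def check_target(rate, target):
    """Report whether a rate meets a target and by how much it falls short."""
    return {
        "rate": rate,
        "target": target,
        "meets_target": rate >= target,
        "shortfall": max(0.0, target - rate),
    }


# Values taken from the readout string; the variable names are assumptions.
TEST_PREDICTIONS = 448
JSON_VALIDITY = 0.8750
JSON_VALIDITY_TARGET = 0.98

result = check_target(JSON_VALIDITY, JSON_VALIDITY_TARGET)
valid_outputs = implied_count(JSON_VALIDITY, TEST_PREDICTIONS)
needed_outputs = implied_count(JSON_VALIDITY_TARGET, TEST_PREDICTIONS)

print(result)
print(f"{valid_outputs} of {TEST_PREDICTIONS} outputs valid; about {needed_outputs} needed for the target")
```

Under these assumptions the shortfall is 10.5 percentage points, which matches the readout's framing of the run as a diagnostic baseline.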
docs/data/publication_audit.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"checks": [
|
| 5 |
{
|
| 6 |
"name": "required_publication_assets_present",
|
|
@@ -182,8 +182,8 @@
|
|
| 182 |
"github_repo": {
|
| 183 |
"root": "repo",
|
| 184 |
"exists": true,
|
| 185 |
-
"file_count":
|
| 186 |
-
"text_file_count":
|
| 187 |
"largest_file": {
|
| 188 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 189 |
"bytes": 55702978
|
|
@@ -193,8 +193,8 @@
|
|
| 193 |
"hf_space_bundle": {
|
| 194 |
"root": "hf_publish/space",
|
| 195 |
"exists": true,
|
| 196 |
-
"file_count":
|
| 197 |
-
"text_file_count":
|
| 198 |
"largest_file": {
|
| 199 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 200 |
"bytes": 55702978
|
|
@@ -204,8 +204,8 @@
|
|
| 204 |
"hf_artifact_bundle": {
|
| 205 |
"root": "hf_publish/artifacts",
|
| 206 |
"exists": true,
|
| 207 |
-
"file_count":
|
| 208 |
-
"text_file_count":
|
| 209 |
"largest_file": {
|
| 210 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 211 |
"bytes": 55702978
|
|
@@ -215,8 +215,8 @@
|
|
| 215 |
"hf_model_bundle": {
|
| 216 |
"root": "hf_publish/model",
|
| 217 |
"exists": true,
|
| 218 |
-
"file_count":
|
| 219 |
-
"text_file_count":
|
| 220 |
"largest_file": {
|
| 221 |
"path": "pytorch_model.bin",
|
| 222 |
"bytes": 93495480
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:02+00:00",
|
| 4 |
"checks": [
|
| 5 |
{
|
| 6 |
"name": "required_publication_assets_present",
|
|
|
|
| 182 |
"github_repo": {
|
| 183 |
"root": "repo",
|
| 184 |
"exists": true,
|
| 185 |
+
"file_count": 450,
|
| 186 |
+
"text_file_count": 380,
|
| 187 |
"largest_file": {
|
| 188 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 189 |
"bytes": 55702978
|
|
|
|
| 193 |
"hf_space_bundle": {
|
| 194 |
"root": "hf_publish/space",
|
| 195 |
"exists": true,
|
| 196 |
+
"file_count": 363,
|
| 197 |
+
"text_file_count": 293,
|
| 198 |
"largest_file": {
|
| 199 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 200 |
"bytes": 55702978
|
|
|
|
| 204 |
"hf_artifact_bundle": {
|
| 205 |
"root": "hf_publish/artifacts",
|
| 206 |
"exists": true,
|
| 207 |
+
"file_count": 522,
|
| 208 |
+
"text_file_count": 428,
|
| 209 |
"largest_file": {
|
| 210 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 211 |
"bytes": 55702978
|
|
|
|
| 215 |
"hf_model_bundle": {
|
| 216 |
"root": "hf_publish/model",
|
| 217 |
"exists": true,
|
| 218 |
+
"file_count": 709,
|
| 219 |
+
"text_file_count": 580,
|
| 220 |
"largest_file": {
|
| 221 |
"path": "pytorch_model.bin",
|
| 222 |
"bytes": 93495480
|
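Each bundle entry in the publication audit above records a `file_count`, a `text_file_count`, and a `largest_file` with its path and size. A minimal sketch of how such a summary could be computed for a directory is given below. How the audit actually decides that a file counts as text is not shown in this diff, so the suffix list `TEXT_SUFFIXES` and the function name `summarize_bundle` are assumptions.

```python
from pathlib import Path

# Assumption: a file counts as "text" if it has one of these suffixes.
TEXT_SUFFIXES = {".md", ".json", ".csv", ".py", ".txt", ".html"}


def summarize_bundle(root):
    """Summarize one bundle directory (hypothetical helper)."""
    root = Path(root)
    if not root.is_dir():
        return {
            "root": str(root),
            "exists": False,
            "file_count": 0,
            "text_file_count": 0,
            "largest_file": None,
        }
    files = [p for p in root.rglob("*") if p.is_file()]
    largest = max(files, key=lambda p: p.stat().st_size, default=None)
    return {
        "root": str(root),
        "exists": True,
        "file_count": len(files),
        "text_file_count": sum(1 for p in files if p.suffix.lower() in TEXT_SUFFIXES),
        "largest_file": None
        if largest is None
        else {
            # Path is reported relative to the bundle root, as in the audit entries.
            "path": largest.relative_to(root).as_posix(),
            "bytes": largest.stat().st_size,
        },
    }
```

Reporting the largest file relative to the bundle root keeps entries comparable across bundles that live under different directories.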
docs/data/scope_claims_audit.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"summary": {
|
| 5 |
"qwen3_omni_verified_diagnostic_pilot": true,
|
| 6 |
"dataset_manifest_num_episodes": 119,
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:01+00:00",
|
| 4 |
"summary": {
|
| 5 |
"qwen3_omni_verified_diagnostic_pilot": true,
|
| 6 |
"dataset_manifest_num_episodes": 119,
|
docs/data/task_surface_integrity.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"summary": {
|
| 5 |
"task_count": 12,
|
| 6 |
"expected_task_count": 12,
|
|
@@ -64,15 +64,21 @@
|
|
| 64 |
"observed": "timeline_action"
|
| 65 |
},
|
| 66 |
{
|
| 67 |
-
"name": "timeline_action:
|
| 68 |
"status": "pass",
|
| 69 |
-
"value": "
|
| 70 |
"raw_hits": []
|
| 71 |
},
|
| 72 |
{
|
| 73 |
-
"name": "timeline_action:
|
| 74 |
"status": "pass",
|
| 75 |
-
"value": "
|
| 76 |
"raw_hits": []
|
| 77 |
},
|
| 78 |
{
|
|
@@ -88,9 +94,9 @@
|
|
| 88 |
"raw_hits": []
|
| 89 |
},
|
| 90 |
{
|
| 91 |
-
"name": "timeline_action:
|
| 92 |
"status": "pass",
|
| 93 |
-
"value": "
|
| 94 |
"raw_hits": []
|
| 95 |
},
|
| 96 |
{
|
|
@@ -99,12 +105,6 @@
|
|
| 99 |
"value": "Look at one short multimodal window and name what action is happening now.",
|
| 100 |
"raw_hits": []
|
| 101 |
},
|
| 102 |
-
{
|
| 103 |
-
"name": "timeline_action: public_field_process_short_is_human_readable",
|
| 104 |
-
"status": "pass",
|
| 105 |
-
"value": "window features -> action label builder -> classifier",
|
| 106 |
-
"raw_hits": []
|
| 107 |
-
},
|
| 108 |
{
|
| 109 |
"name": "timeline_action: known_task_family",
|
| 110 |
"status": "pass",
|
|
@@ -184,15 +184,21 @@
|
|
| 184 |
"observed": "timeline_subtask"
|
| 185 |
},
|
| 186 |
{
|
| 187 |
-
"name": "timeline_subtask:
|
| 188 |
"status": "pass",
|
| 189 |
-
"value": "
|
| 190 |
"raw_hits": []
|
| 191 |
},
|
| 192 |
{
|
| 193 |
-
"name": "timeline_subtask:
|
| 194 |
"status": "pass",
|
| 195 |
-
"value": "
|
| 196 |
"raw_hits": []
|
| 197 |
},
|
| 198 |
{
|
|
@@ -208,9 +214,9 @@
|
|
| 208 |
"raw_hits": []
|
| 209 |
},
|
| 210 |
{
|
| 211 |
-
"name": "timeline_subtask:
|
| 212 |
"status": "pass",
|
| 213 |
-
"value": "
|
| 214 |
"raw_hits": []
|
| 215 |
},
|
| 216 |
{
|
|
@@ -219,12 +225,6 @@
|
|
| 219 |
"value": "Predict the higher-level task stage for the current window.",
|
| 220 |
"raw_hits": []
|
| 221 |
},
|
| 222 |
-
{
|
| 223 |
-
"name": "timeline_subtask: public_field_process_short_is_human_readable",
|
| 224 |
-
"status": "pass",
|
| 225 |
-
"value": "window features -> subtask label builder -> classifier",
|
| 226 |
-
"raw_hits": []
|
| 227 |
-
},
|
| 228 |
{
|
| 229 |
"name": "timeline_subtask: known_task_family",
|
| 230 |
"status": "pass",
|
|
@@ -304,15 +304,21 @@
|
|
| 304 |
"observed": "transition_detection"
|
| 305 |
},
|
| 306 |
{
|
| 307 |
-
"name": "transition_detection:
|
| 308 |
"status": "pass",
|
| 309 |
-
"value": "
|
| 310 |
"raw_hits": []
|
| 311 |
},
|
| 312 |
{
|
| 313 |
-
"name": "transition_detection:
|
| 314 |
"status": "pass",
|
| 315 |
-
"value": "
|
| 316 |
"raw_hits": []
|
| 317 |
},
|
| 318 |
{
|
|
@@ -328,9 +334,9 @@
|
|
| 328 |
"raw_hits": []
|
| 329 |
},
|
| 330 |
{
|
| 331 |
-
"name": "transition_detection:
|
| 332 |
"status": "pass",
|
| 333 |
-
"value": "
|
| 334 |
"raw_hits": []
|
| 335 |
},
|
| 336 |
{
|
|
@@ -339,12 +345,6 @@
|
|
| 339 |
"value": "Detect whether the current window is near a boundary between actions.",
|
| 340 |
"raw_hits": []
|
| 341 |
},
|
| 342 |
-
{
|
| 343 |
-
"name": "transition_detection: public_field_process_short_is_human_readable",
|
| 344 |
-
"status": "pass",
|
| 345 |
-
"value": "action changes -> boundary labels -> binary classifier",
|
| 346 |
-
"raw_hits": []
|
| 347 |
-
},
|
| 348 |
{
|
| 349 |
"name": "transition_detection: known_task_family",
|
| 350 |
"status": "pass",
|
|
@@ -422,15 +422,21 @@
|
|
| 422 |
"observed": "next_action"
|
| 423 |
},
|
| 424 |
{
|
| 425 |
-
"name": "next_action:
|
| 426 |
"status": "pass",
|
| 427 |
-
"value": "
|
| 428 |
"raw_hits": []
|
| 429 |
},
|
| 430 |
{
|
| 431 |
-
"name": "next_action:
|
| 432 |
"status": "pass",
|
| 433 |
-
"value": "
|
| 434 |
"raw_hits": []
|
| 435 |
},
|
| 436 |
{
|
|
@@ -446,9 +452,9 @@
|
|
| 446 |
"raw_hits": []
|
| 447 |
},
|
| 448 |
{
|
| 449 |
-
"name": "next_action:
|
| 450 |
"status": "pass",
|
| 451 |
-
"value": "
|
| 452 |
"raw_hits": []
|
| 453 |
},
|
| 454 |
{
|
|
@@ -457,12 +463,6 @@
|
|
| 457 |
"value": "Use the current window to guess the action that will happen shortly after it.",
|
| 458 |
"raw_hits": []
|
| 459 |
},
|
| 460 |
-
{
|
| 461 |
-
"name": "next_action: public_field_process_short_is_human_readable",
|
| 462 |
-
"status": "pass",
|
| 463 |
-
"value": "current features -> future label shift -> classifier",
|
| 464 |
-
"raw_hits": []
|
| 465 |
-
},
|
| 466 |
{
|
| 467 |
"name": "next_action: known_task_family",
|
| 468 |
"status": "pass",
|
|
@@ -540,15 +540,21 @@
|
|
| 540 |
"observed": "hand_trajectory_forecast"
|
| 541 |
},
|
| 542 |
{
|
| 543 |
-
"name": "hand_trajectory_forecast:
|
| 544 |
"status": "pass",
|
| 545 |
-
"value": "current multimodal
|
| 546 |
"raw_hits": []
|
| 547 |
},
|
| 548 |
{
|
| 549 |
-
"name": "hand_trajectory_forecast:
|
| 550 |
"status": "pass",
|
| 551 |
-
"value": "
|
| 552 |
"raw_hits": []
|
| 553 |
},
|
| 554 |
{
|
|
@@ -564,9 +570,9 @@
|
|
| 564 |
"raw_hits": []
|
| 565 |
},
|
| 566 |
{
|
| 567 |
-
"name": "hand_trajectory_forecast:
|
| 568 |
"status": "pass",
|
| 569 |
-
"value": "
|
| 570 |
"raw_hits": []
|
| 571 |
},
|
| 572 |
{
|
|
@@ -575,12 +581,6 @@
|
|
| 575 |
"value": "Predict where the hands will move over the next few frames.",
|
| 576 |
"raw_hits": []
|
| 577 |
},
|
| 578 |
-
{
|
| 579 |
-
"name": "hand_trajectory_forecast: public_field_process_short_is_human_readable",
|
| 580 |
-
"status": "pass",
|
| 581 |
-
"value": "current features -> future mocap target -> regression head",
|
| 582 |
-
"raw_hits": []
|
| 583 |
-
},
|
| 584 |
{
|
| 585 |
"name": "hand_trajectory_forecast: known_task_family",
|
| 586 |
"status": "pass",
|
|
@@ -658,15 +658,21 @@
|
|
| 658 |
"observed": "contact_prediction"
|
| 659 |
},
|
| 660 |
{
|
| 661 |
-
"name": "contact_prediction:
|
| 662 |
"status": "pass",
|
| 663 |
-
"value": "
|
| 664 |
"raw_hits": []
|
| 665 |
},
|
| 666 |
{
|
| 667 |
-
"name": "contact_prediction:
|
| 668 |
"status": "pass",
|
| 669 |
-
"value": "
|
| 670 |
"raw_hits": []
|
| 671 |
},
|
| 672 |
{
|
|
@@ -682,9 +688,9 @@
|
|
| 682 |
"raw_hits": []
|
| 683 |
},
|
| 684 |
{
|
| 685 |
-
"name": "contact_prediction:
|
| 686 |
"status": "pass",
|
| 687 |
-
"value": "
|
| 688 |
"raw_hits": []
|
| 689 |
},
|
| 690 |
{
|
|
@@ -693,12 +699,6 @@
|
|
| 693 |
"value": "Predict whether the body or hand is in contact with something.",
|
| 694 |
"raw_hits": []
|
| 695 |
},
|
| 696 |
-
{
|
| 697 |
-
"name": "contact_prediction: public_field_process_short_is_human_readable",
|
| 698 |
-
"status": "pass",
|
| 699 |
-
"value": "feature filter -> contact target -> binary classifier",
|
| 700 |
-
"raw_hits": []
|
| 701 |
-
},
|
| 702 |
{
|
| 703 |
"name": "contact_prediction: known_task_family",
|
| 704 |
"status": "pass",
|
|
@@ -774,15 +774,21 @@
|
|
| 774 |
"observed": "object_relevance"
|
| 775 |
},
|
| 776 |
{
|
| 777 |
-
"name": "object_relevance:
|
| 778 |
"status": "pass",
|
| 779 |
-
"value": "non-caption
|
| 780 |
"raw_hits": []
|
| 781 |
},
|
| 782 |
{
|
| 783 |
-
"name": "object_relevance:
|
| 784 |
"status": "pass",
|
| 785 |
-
"value": "
|
| 786 |
"raw_hits": []
|
| 787 |
},
|
| 788 |
{
|
|
@@ -798,9 +804,9 @@
|
|
| 798 |
"raw_hits": []
|
| 799 |
},
|
| 800 |
{
|
| 801 |
-
"name": "object_relevance:
|
| 802 |
"status": "pass",
|
| 803 |
-
"value": "
|
| 804 |
"raw_hits": []
|
| 805 |
},
|
| 806 |
{
|
|
@@ -809,12 +815,6 @@
|
|
| 809 |
"value": "Predict which objects matter in the current window.",
|
| 810 |
"raw_hits": []
|
| 811 |
},
|
| 812 |
-
{
|
| 813 |
-
"name": "object_relevance: public_field_process_short_is_human_readable",
|
| 814 |
-
"status": "pass",
|
| 815 |
-
"value": "object vocabulary -> multi-hot labels -> sigmoid heads",
|
| 816 |
-
"raw_hits": []
|
| 817 |
-
},
|
| 818 |
{
|
| 819 |
"name": "object_relevance: known_task_family",
|
| 820 |
"status": "pass",
|
|
@@ -892,15 +892,21 @@
|
|
| 892 |
"observed": "caption_grounding"
|
| 893 |
},
|
| 894 |
{
|
| 895 |
-
"name": "caption_grounding:
|
| 896 |
"status": "pass",
|
| 897 |
-
"value": "
|
| 898 |
"raw_hits": []
|
| 899 |
},
|
| 900 |
{
|
| 901 |
-
"name": "caption_grounding:
|
| 902 |
"status": "pass",
|
| 903 |
-
"value": "
|
| 904 |
"raw_hits": []
|
| 905 |
},
|
| 906 |
{
|
|
@@ -916,9 +922,9 @@
|
|
| 916 |
"raw_hits": []
|
| 917 |
},
|
| 918 |
{
|
| 919 |
-
"name": "caption_grounding:
|
| 920 |
"status": "pass",
|
| 921 |
-
"value": "
|
| 922 |
"raw_hits": []
|
| 923 |
},
|
| 924 |
{
|
|
@@ -927,12 +933,6 @@
|
|
| 927 |
"value": "Given a text-like query from annotation, find the matching time window.",
|
| 928 |
"raw_hits": []
|
| 929 |
},
|
| 930 |
-
{
|
| 931 |
-
"name": "caption_grounding: public_field_process_short_is_human_readable",
|
| 932 |
-
"status": "pass",
|
| 933 |
-
"value": "query features -> candidate index -> cosine ranker",
|
| 934 |
-
"raw_hits": []
|
| 935 |
-
},
|
| 936 |
{
|
| 937 |
"name": "caption_grounding: known_task_family",
|
| 938 |
"status": "pass",
|
|
@@ -1008,15 +1008,21 @@
|
|
| 1008 |
"observed": "cross_modal_retrieval"
|
| 1009 |
},
|
| 1010 |
{
|
| 1011 |
-
"name": "cross_modal_retrieval:
|
| 1012 |
"status": "pass",
|
| 1013 |
-
"value": "motion
|
| 1014 |
"raw_hits": []
|
| 1015 |
},
|
| 1016 |
{
|
| 1017 |
-
"name": "cross_modal_retrieval:
|
| 1018 |
"status": "pass",
|
| 1019 |
-
"value": "
|
| 1020 |
"raw_hits": []
|
| 1021 |
},
|
| 1022 |
{
|
|
@@ -1032,9 +1038,9 @@
|
|
| 1032 |
"raw_hits": []
|
| 1033 |
},
|
| 1034 |
{
|
| 1035 |
-
"name": "cross_modal_retrieval:
|
| 1036 |
"status": "pass",
|
| 1037 |
-
"value": "
|
| 1038 |
"raw_hits": []
|
| 1039 |
},
|
| 1040 |
{
|
|
@@ -1043,12 +1049,6 @@
|
|
| 1043 |
"value": "Use one group of modalities to retrieve the matching window from another group.",
|
| 1044 |
"raw_hits": []
|
| 1045 |
},
|
| 1046 |
-
{
|
| 1047 |
-
"name": "cross_modal_retrieval: public_field_process_short_is_human_readable",
|
| 1048 |
-
"status": "pass",
|
| 1049 |
-
"value": "modality split -> projection -> nearest-neighbor ranker",
|
| 1050 |
-
"raw_hits": []
|
| 1051 |
-
},
|
| 1052 |
{
|
| 1053 |
"name": "cross_modal_retrieval: known_task_family",
|
| 1054 |
"status": "pass",
|
|
@@ -1126,15 +1126,21 @@
|
|
| 1126 |
"observed": "modality_reconstruction"
|
| 1127 |
},
|
| 1128 |
{
|
| 1129 |
-
"name": "modality_reconstruction:
|
| 1130 |
"status": "pass",
|
| 1131 |
-
"value": "motion, IMU, and camera
|
| 1132 |
"raw_hits": []
|
| 1133 |
},
|
| 1134 |
{
|
| 1135 |
-
"name": "modality_reconstruction:
|
| 1136 |
"status": "pass",
|
| 1137 |
-
"value": "
|
| 1138 |
"raw_hits": []
|
| 1139 |
},
|
| 1140 |
{
|
|
@@ -1150,9 +1156,9 @@
|
|
| 1150 |
"raw_hits": []
|
| 1151 |
},
|
| 1152 |
{
|
| 1153 |
-
"name": "modality_reconstruction:
|
| 1154 |
"status": "pass",
|
| 1155 |
-
"value": "
|
| 1156 |
"raw_hits": []
|
| 1157 |
},
|
| 1158 |
{
|
|
@@ -1161,12 +1167,6 @@
|
|
| 1161 |
"value": "Predict one modality feature block from other modality blocks.",
|
| 1162 |
"raw_hits": []
|
| 1163 |
},
|
| 1164 |
-
{
|
| 1165 |
-
"name": "modality_reconstruction: public_field_process_short_is_human_readable",
|
| 1166 |
-
"status": "pass",
|
| 1167 |
-
"value": "source-target split -> scaler -> regression head",
|
| 1168 |
-
"raw_hits": []
|
| 1169 |
-
},
|
| 1170 |
{
|
| 1171 |
"name": "modality_reconstruction: known_task_family",
|
| 1172 |
"status": "pass",
|
|
@@ -1243,12 +1243,6 @@
|
|
| 1243 |
"status": "pass",
|
| 1244 |
"observed": "temporal_order"
|
| 1245 |
},
|
| 1246 |
-
{
|
| 1247 |
-
"name": "temporal_order: public_field_input_short_is_human_readable",
|
| 1248 |
-
"status": "pass",
|
| 1249 |
-
"value": "two adjacent windows plus difference vector",
|
| 1250 |
-
"raw_hits": []
|
| 1251 |
-
},
|
| 1252 |
{
|
| 1253 |
"name": "temporal_order: public_field_card_blurb_is_human_readable",
|
| 1254 |
"status": "pass",
|
|
@@ -1256,27 +1250,27 @@
|
|
| 1256 |
"raw_hits": []
|
| 1257 |
},
|
| 1258 |
{
|
| 1259 |
-
"name": "temporal_order:
|
| 1260 |
"status": "pass",
|
| 1261 |
"value": "Temporal Order Verification",
|
| 1262 |
"raw_hits": []
|
| 1263 |
},
|
| 1264 |
{
|
| 1265 |
-
"name": "temporal_order:
|
| 1266 |
"status": "pass",
|
| 1267 |
-
"value": "
|
| 1268 |
"raw_hits": []
|
| 1269 |
},
|
| 1270 |
{
|
| 1271 |
-
"name": "temporal_order:
|
| 1272 |
"status": "pass",
|
| 1273 |
"value": "Temporal Order Verification",
|
| 1274 |
"raw_hits": []
|
| 1275 |
},
|
| 1276 |
{
|
| 1277 |
-
"name": "temporal_order:
|
| 1278 |
"status": "pass",
|
| 1279 |
-
"value": "
|
| 1280 |
"raw_hits": []
|
| 1281 |
},
|
| 1282 |
{
|
|
@@ -1285,6 +1279,12 @@
|
|
| 1285 |
"value": "pair builder -> feature combiner -> binary classifier",
|
| 1286 |
"raw_hits": []
|
| 1287 |
},
|
| 1288 |
{
|
| 1289 |
"name": "temporal_order: known_task_family",
|
| 1290 |
"status": "pass",
|
|
@@ -1360,15 +1360,21 @@
|
|
| 1360 |
"observed": "misalignment_detection"
|
| 1361 |
},
|
| 1362 |
{
|
| 1363 |
-
"name": "misalignment_detection:
|
| 1364 |
"status": "pass",
|
| 1365 |
-
"value": "motion
|
| 1366 |
"raw_hits": []
|
| 1367 |
},
|
| 1368 |
{
|
| 1369 |
-
"name": "misalignment_detection:
|
| 1370 |
"status": "pass",
|
| 1371 |
-
"value": "
|
| 1372 |
"raw_hits": []
|
| 1373 |
},
|
| 1374 |
{
|
|
@@ -1384,9 +1390,9 @@
|
|
| 1384 |
"raw_hits": []
|
| 1385 |
},
|
| 1386 |
{
|
| 1387 |
-
"name": "misalignment_detection:
|
| 1388 |
"status": "pass",
|
| 1389 |
-
"value": "
|
| 1390 |
"raw_hits": []
|
| 1391 |
},
|
| 1392 |
{
|
|
@@ -1395,12 +1401,6 @@
|
|
| 1395 |
"value": "Detect when modalities that should match are shifted out of sync.",
|
| 1396 |
"raw_hits": []
|
| 1397 |
},
|
| 1398 |
-
{
|
| 1399 |
-
"name": "misalignment_detection: public_field_process_short_is_human_readable",
|
| 1400 |
-
"status": "pass",
|
| 1401 |
-
"value": "aligned/shifted pairs -> feature combiner -> binary classifier",
|
| 1402 |
-
"raw_hits": []
|
| 1403 |
-
},
|
| 1404 |
{
|
| 1405 |
"name": "misalignment_detection: known_task_family",
|
| 1406 |
"status": "pass",
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:53:59+00:00",
|
| 4 |
"summary": {
|
| 5 |
"task_count": 12,
|
| 6 |
"expected_task_count": 12,
|
|
|
|
| 64 |
"observed": "timeline_action"
|
| 65 |
},
|
| 66 |
{
|
| 67 |
+
"name": "timeline_action: public_field_card_blurb_is_human_readable",
|
| 68 |
"status": "pass",
|
| 69 |
+
"value": "Recognize the current manipulation action from synchronized visual, motion, inertial, pose, and annotation context.",
|
| 70 |
"raw_hits": []
|
| 71 |
},
|
| 72 |
{
|
| 73 |
+
"name": "timeline_action: public_field_research_name_is_human_readable",
|
| 74 |
"status": "pass",
|
| 75 |
+
"value": "Egocentric Action Recognition",
|
| 76 |
+
"raw_hits": []
|
| 77 |
+
},
|
| 78 |
+
{
|
| 79 |
+
"name": "timeline_action: public_field_input_short_is_human_readable",
|
| 80 |
+
"status": "pass",
|
| 81 |
+
"value": "20-frame multimodal window",
|
| 82 |
"raw_hits": []
|
| 83 |
},
|
| 84 |
{
|
|
|
|
| 94 |
"raw_hits": []
|
| 95 |
},
|
| 96 |
{
|
| 97 |
+
"name": "timeline_action: public_field_process_short_is_human_readable",
|
| 98 |
"status": "pass",
|
| 99 |
+
"value": "window features -> action label builder -> classifier",
|
| 100 |
"raw_hits": []
|
| 101 |
},
|
| 102 |
{
|
|
|
|
| 105 |
"value": "Look at one short multimodal window and name what action is happening now.",
|
| 106 |
"raw_hits": []
|
| 107 |
},
|
| 108 |
{
|
| 109 |
"name": "timeline_action: known_task_family",
|
| 110 |
"status": "pass",
|
|
|
|
| 184 |
"observed": "timeline_subtask"
|
| 185 |
},
|
| 186 |
{
|
| 187 |
+
"name": "timeline_subtask: public_field_card_blurb_is_human_readable",
|
| 188 |
"status": "pass",
|
| 189 |
+
"value": "Recognize the broader activity stage so fine actions become a readable procedure timeline.",
|
| 190 |
"raw_hits": []
|
| 191 |
},
|
| 192 |
{
|
| 193 |
+
"name": "timeline_subtask: public_field_research_name_is_human_readable",
|
| 194 |
"status": "pass",
|
| 195 |
+
"value": "Temporal Subtask Recognition",
|
| 196 |
+
"raw_hits": []
|
| 197 |
+
},
|
| 198 |
+
{
|
| 199 |
+
"name": "timeline_subtask: public_field_input_short_is_human_readable",
|
| 200 |
+
"status": "pass",
|
| 201 |
+
"value": "20-frame multimodal window",
|
| 202 |
"raw_hits": []
|
| 203 |
},
|
| 204 |
{
|
|
|
|
| 214 |
"raw_hits": []
|
| 215 |
},
|
| 216 |
{
|
| 217 |
+
"name": "timeline_subtask: public_field_process_short_is_human_readable",
|
| 218 |
"status": "pass",
|
| 219 |
+
"value": "window features -> subtask label builder -> classifier",
|
| 220 |
"raw_hits": []
|
| 221 |
},
|
| 222 |
{
|
|
|
|
| 225 |
"value": "Predict the higher-level task stage for the current window.",
|
| 226 |
"raw_hits": []
|
| 227 |
},
|
| 228 |
{
|
| 229 |
"name": "timeline_subtask: known_task_family",
|
| 230 |
"status": "pass",
|
|
|
|
| 304 |
"observed": "transition_detection"
|
| 305 |
},
|
| 306 |
{
|
| 307 |
+
"name": "transition_detection: public_field_card_blurb_is_human_readable",
|
| 308 |
"status": "pass",
|
| 309 |
+
"value": "Detect the local moment where the episode changes from one action segment to the next.",
|
| 310 |
"raw_hits": []
|
| 311 |
},
|
| 312 |
{
|
| 313 |
+
"name": "transition_detection: public_field_research_name_is_human_readable",
|
| 314 |
"status": "pass",
|
| 315 |
+
"value": "Temporal Action Segmentation",
|
| 316 |
+
"raw_hits": []
|
| 317 |
+
},
|
| 318 |
+
{
|
| 319 |
+
"name": "transition_detection: public_field_input_short_is_human_readable",
|
| 320 |
+
"status": "pass",
|
| 321 |
+
"value": "current window with boundary target",
|
| 322 |
"raw_hits": []
|
| 323 |
},
|
| 324 |
{
|
|
|
|
| 334 |
"raw_hits": []
|
| 335 |
},
|
| 336 |
{
|
| 337 |
+
"name": "transition_detection: public_field_process_short_is_human_readable",
|
| 338 |
"status": "pass",
|
| 339 |
+
"value": "action changes -> boundary labels -> binary classifier",
|
| 340 |
"raw_hits": []
|
| 341 |
},
|
| 342 |
{
|
|
|
|
| 345 |
"value": "Detect whether the current window is near a boundary between actions.",
|
| 346 |
"raw_hits": []
|
| 347 |
},
|
| 348 |
{
|
| 349 |
"name": "transition_detection: known_task_family",
|
| 350 |
"status": "pass",
|
|
|
|
| 422 |
"observed": "next_action"
|
| 423 |
},
|
| 424 |
{
|
| 425 |
+
"name": "next_action: public_field_card_blurb_is_human_readable",
|
| 426 |
"status": "pass",
|
| 427 |
+
"value": "Forecast the near-future action from the current observations only.",
|
| 428 |
"raw_hits": []
|
| 429 |
},
|
| 430 |
{
|
| 431 |
+
"name": "next_action: public_field_research_name_is_human_readable",
|
| 432 |
"status": "pass",
|
| 433 |
+
"value": "Short-Horizon Intention Prediction",
|
| 434 |
+
"raw_hits": []
|
| 435 |
+
},
|
| 436 |
+
{
|
| 437 |
+
"name": "next_action: public_field_input_short_is_human_readable",
|
| 438 |
+
"status": "pass",
|
| 439 |
+
"value": "current window at time t",
|
| 440 |
"raw_hits": []
|
| 441 |
},
|
| 442 |
{
|
|
|
|
| 452 |
"raw_hits": []
|
| 453 |
},
|
| 454 |
{
|
| 455 |
+
"name": "next_action: public_field_process_short_is_human_readable",
|
| 456 |
"status": "pass",
|
| 457 |
+
"value": "current features -> future label shift -> classifier",
|
| 458 |
"raw_hits": []
|
| 459 |
},
|
| 460 |
{
|
|
|
|
| 463 |
"value": "Use the current window to guess the action that will happen shortly after it.",
|
| 464 |
"raw_hits": []
|
| 465 |
},
|
| 466 |
{
|
| 467 |
"name": "next_action: known_task_family",
|
| 468 |
"status": "pass",
|
|
|
|
| 540 |
"observed": "hand_trajectory_forecast"
|
| 541 |
},
|
| 542 |
{
|
| 543 |
+
"name": "hand_trajectory_forecast: public_field_card_blurb_is_human_readable",
|
| 544 |
"status": "pass",
|
| 545 |
+
"value": "Predict the future 3D left/right hand path from the current multimodal state.",
|
| 546 |
"raw_hits": []
|
| 547 |
},
|
| 548 |
{
|
| 549 |
+
"name": "hand_trajectory_forecast: public_field_research_name_is_human_readable",
|
| 550 |
"status": "pass",
|
| 551 |
+
"value": "3D Hand Motion Forecasting",
|
| 552 |
+
"raw_hits": []
|
| 553 |
+
},
|
| 554 |
+
{
|
| 555 |
+
"name": "hand_trajectory_forecast: public_field_input_short_is_human_readable",
|
| 556 |
+
"status": "pass",
|
| 557 |
+
"value": "current multimodal window",
|
| 558 |
"raw_hits": []
|
| 559 |
},
|
| 560 |
{
|
|
|
|
| 570 |
"raw_hits": []
|
| 571 |
},
|
| 572 |
{
|
| 573 |
+
"name": "hand_trajectory_forecast: public_field_process_short_is_human_readable",
|
| 574 |
"status": "pass",
|
| 575 |
+
"value": "current features -> future mocap target -> regression head",
|
| 576 |
"raw_hits": []
|
| 577 |
},
|
| 578 |
{
|
|
|
|
| 581 |
"value": "Predict where the hands will move over the next few frames.",
|
| 582 |
"raw_hits": []
|
| 583 |
},
|
| 584 |
{
|
| 585 |
"name": "hand_trajectory_forecast: known_task_family",
|
| 586 |
"status": "pass",
|
|
|
|
| 658 |
"observed": "contact_prediction"
|
| 659 |
},
|
| 660 |
{
|
| 661 |
+
"name": "contact_prediction: public_field_card_blurb_is_human_readable",
|
| 662 |
"status": "pass",
|
| 663 |
+
"value": "Predict whether body or hand contact with the scene is occurring without leaking contact labels.",
|
| 664 |
"raw_hits": []
|
| 665 |
},
|
| 666 |
{
|
| 667 |
+
"name": "contact_prediction: public_field_research_name_is_human_readable",
|
| 668 |
"status": "pass",
|
| 669 |
+
"value": "Human-Object Contact Prediction",
|
| 670 |
+
"raw_hits": []
|
| 671 |
+
},
|
| 672 |
+
{
|
| 673 |
+
"name": "contact_prediction: public_field_input_short_is_human_readable",
|
| 674 |
+
"status": "pass",
|
| 675 |
+
"value": "non-contact, non-caption features",
|
| 676 |
"raw_hits": []
|
| 677 |
},
|
| 678 |
{
|
|
|
|
| 688 |
"raw_hits": []
|
| 689 |
},
|
| 690 |
{
|
| 691 |
+
"name": "contact_prediction: public_field_process_short_is_human_readable",
|
| 692 |
"status": "pass",
|
| 693 |
+
"value": "feature filter -> contact target -> binary classifier",
|
| 694 |
"raw_hits": []
|
| 695 |
},
|
| 696 |
{
|
|
|
|
| 699 |
"value": "Predict whether the body or hand is in contact with something.",
|
| 700 |
"raw_hits": []
|
| 701 |
},
|
| 702 |
{
|
| 703 |
"name": "contact_prediction: known_task_family",
|
| 704 |
"status": "pass",
|
|
|
|
| 774 |
"observed": "object_relevance"
|
| 775 |
},
|
| 776 |
{
|
| 777 |
+
"name": "object_relevance: public_field_card_blurb_is_human_readable",
|
| 778 |
"status": "pass",
|
| 779 |
+
"value": "Infer which objects are relevant to the current manipulation window from non-caption features.",
|
| 780 |
"raw_hits": []
|
| 781 |
},
|
| 782 |
{
|
| 783 |
+
"name": "object_relevance: public_field_research_name_is_human_readable",
|
| 784 |
"status": "pass",
|
| 785 |
+
"value": "Object-Centric Interaction Recognition",
|
| 786 |
+
"raw_hits": []
|
| 787 |
+
},
|
| 788 |
+
{
|
| 789 |
+
"name": "object_relevance: public_field_input_short_is_human_readable",
|
| 790 |
+
"status": "pass",
|
| 791 |
+
"value": "non-caption multimodal features",
|
| 792 |
"raw_hits": []
|
| 793 |
},
|
| 794 |
{
|
|
|
|
| 804 |
"raw_hits": []
|
| 805 |
},
|
| 806 |
{
|
| 807 |
+
"name": "object_relevance: public_field_process_short_is_human_readable",
|
| 808 |
"status": "pass",
|
| 809 |
+
"value": "object vocabulary -> multi-hot labels -> sigmoid heads",
|
| 810 |
"raw_hits": []
|
| 811 |
},
|
| 812 |
{
|
|
|
|
| 815 |
"value": "Predict which objects matter in the current window.",
|
| 816 |
"raw_hits": []
|
| 817 |
},
|
| 818 |
{
|
| 819 |
"name": "object_relevance: known_task_family",
|
| 820 |
"status": "pass",
|
|
|
|
| 892 |
"observed": "caption_grounding"
|
| 893 |
},
|
| 894 |
{
|
| 895 |
+
"name": "caption_grounding: public_field_card_blurb_is_human_readable",
|
| 896 |
"status": "pass",
|
| 897 |
+
"value": "Retrieve the matching time window for an annotation-derived text query.",
|
| 898 |
"raw_hits": []
|
| 899 |
},
|
| 900 |
{
|
| 901 |
+
"name": "caption_grounding: public_field_research_name_is_human_readable",
|
| 902 |
"status": "pass",
|
| 903 |
+
"value": "Language-to-Moment Grounding",
|
| 904 |
+
"raw_hits": []
|
| 905 |
+
},
|
| 906 |
+
{
|
| 907 |
+
"name": "caption_grounding: public_field_input_short_is_human_readable",
|
| 908 |
+
"status": "pass",
|
| 909 |
+
"value": "text-like query and candidate windows",
|
| 910 |
"raw_hits": []
|
| 911 |
},
|
| 912 |
{
|
|
|
|
| 922 |
"raw_hits": []
|
| 923 |
},
|
| 924 |
{
|
| 925 |
+
"name": "caption_grounding: public_field_process_short_is_human_readable",
|
| 926 |
"status": "pass",
|
| 927 |
+
"value": "query features -> candidate index -> cosine ranker",
|
| 928 |
"raw_hits": []
|
| 929 |
},
|
| 930 |
{
|
|
|
|
| 933 |
"value": "Given a text-like query from annotation, find the matching time window.",
|
| 934 |
"raw_hits": []
|
| 935 |
},
|
| 936 |
{
|
| 937 |
"name": "caption_grounding: known_task_family",
|
| 938 |
"status": "pass",
|
|
|
|
| 1008 |
"observed": "cross_modal_retrieval"
|
| 1009 |
},
|
| 1010 |
{
|
| 1011 |
+
"name": "cross_modal_retrieval: public_field_card_blurb_is_human_readable",
|
| 1012 |
"status": "pass",
|
| 1013 |
+
"value": "Use motion, IMU, and camera-pose signals to retrieve the matching depth/video window.",
|
| 1014 |
"raw_hits": []
|
| 1015 |
},
|
| 1016 |
{
|
| 1017 |
+
"name": "cross_modal_retrieval: public_field_research_name_is_human_readable",
|
| 1018 |
"status": "pass",
|
| 1019 |
+
"value": "Multimodal Representation Retrieval",
|
| 1020 |
+
"raw_hits": []
|
| 1021 |
+
},
|
| 1022 |
+
{
|
| 1023 |
+
"name": "cross_modal_retrieval: public_field_input_short_is_human_readable",
|
| 1024 |
+
"status": "pass",
|
| 1025 |
+
"value": "motion/IMU/pose query; depth/video candidates",
|
| 1026 |
"raw_hits": []
|
| 1027 |
},
|
| 1028 |
{
|
|
|
|
| 1038 |
"raw_hits": []
|
| 1039 |
},
|
| 1040 |
{
|
| 1041 |
+
"name": "cross_modal_retrieval: public_field_process_short_is_human_readable",
|
| 1042 |
"status": "pass",
|
| 1043 |
+
"value": "modality split -> projection -> nearest-neighbor ranker",
|
| 1044 |
"raw_hits": []
|
| 1045 |
},
|
| 1046 |
{
|
|
|
|
| 1049 |
"value": "Use one group of modalities to retrieve the matching window from another group.",
|
| 1050 |
"raw_hits": []
|
| 1051 |
},
|
| 1052 |
{
|
| 1053 |
"name": "cross_modal_retrieval: known_task_family",
|
| 1054 |
"status": "pass",
|
|
|
|
| 1126 |
"observed": "modality_reconstruction"
|
| 1127 |
},
|
| 1128 |
{
|
| 1129 |
+
"name": "modality_reconstruction: public_field_card_blurb_is_human_readable",
|
| 1130 |
"status": "pass",
|
| 1131 |
+
"value": "Predict compressed depth/video feature vectors from motion, IMU, and camera-pose features.",
|
| 1132 |
"raw_hits": []
|
| 1133 |
},
|
| 1134 |
{
|
| 1135 |
+
"name": "modality_reconstruction: public_field_research_name_is_human_readable",
|
| 1136 |
"status": "pass",
|
| 1137 |
+
"value": "Modality Feature Reconstruction",
|
| 1138 |
+
"raw_hits": []
|
| 1139 |
+
},
|
| 1140 |
+
{
|
| 1141 |
+
"name": "modality_reconstruction: public_field_input_short_is_human_readable",
|
| 1142 |
+
"status": "pass",
|
| 1143 |
+
"value": "motion, IMU, and camera/pose features",
|
| 1144 |
"raw_hits": []
|
| 1145 |
},
|
| 1146 |
{
|
|
|
|
| 1156 |
"raw_hits": []
|
| 1157 |
},
|
| 1158 |
{
|
| 1159 |
+
"name": "modality_reconstruction: public_field_process_short_is_human_readable",
|
| 1160 |
"status": "pass",
|
| 1161 |
+
"value": "source-target split -> scaler -> regression head",
|
| 1162 |
"raw_hits": []
|
| 1163 |
},
|
| 1164 |
{
|
|
|
|
| 1167 |
"value": "Predict one modality feature block from other modality blocks.",
|
| 1168 |
"raw_hits": []
|
| 1169 |
},
|
| 1170 |
{
|
| 1171 |
"name": "modality_reconstruction: known_task_family",
|
| 1172 |
"status": "pass",
|
|
|
|
| 1243 |
"status": "pass",
|
| 1244 |
"observed": "temporal_order"
|
| 1245 |
},
|
| 1246 |
{
|
| 1247 |
"name": "temporal_order: public_field_card_blurb_is_human_readable",
|
| 1248 |
"status": "pass",
|
|
|
|
| 1250 |
"raw_hits": []
|
| 1251 |
},
|
| 1252 |
{
|
| 1253 |
+
"name": "temporal_order: public_field_research_name_is_human_readable",
|
| 1254 |
"status": "pass",
|
| 1255 |
"value": "Temporal Order Verification",
|
| 1256 |
"raw_hits": []
|
| 1257 |
},
|
| 1258 |
{
|
| 1259 |
+
"name": "temporal_order: public_field_input_short_is_human_readable",
|
| 1260 |
"status": "pass",
|
| 1261 |
+
"value": "two adjacent windows plus difference vector",
|
| 1262 |
"raw_hits": []
|
| 1263 |
},
|
| 1264 |
{
|
| 1265 |
+
"name": "temporal_order: public_field_display_name_is_human_readable",
|
| 1266 |
"status": "pass",
|
| 1267 |
"value": "Temporal Order Verification",
|
| 1268 |
"raw_hits": []
|
| 1269 |
},
|
| 1270 |
{
|
| 1271 |
+
"name": "temporal_order: public_field_output_short_is_human_readable",
|
| 1272 |
"status": "pass",
|
| 1273 |
+
"value": "correct or reversed",
|
| 1274 |
"raw_hits": []
|
| 1275 |
},
|
| 1276 |
{
|
|
|
|
| 1279 |
"value": "pair builder -> feature combiner -> binary classifier",
|
| 1280 |
"raw_hits": []
|
| 1281 |
},
|
| 1282 |
+
{
|
| 1283 |
+
"name": "temporal_order: public_field_plain_goal_is_human_readable",
|
| 1284 |
+
"status": "pass",
|
| 1285 |
+
"value": "Tell whether two nearby windows are in the correct time order.",
|
| 1286 |
+
"raw_hits": []
|
| 1287 |
+
},
|
| 1288 |
{
|
| 1289 |
"name": "temporal_order: known_task_family",
|
| 1290 |
"status": "pass",
|
|
|
|
| 1360 |
"observed": "misalignment_detection"
|
| 1361 |
},
|
| 1362 |
{
|
| 1363 |
+
"name": "misalignment_detection: public_field_card_blurb_is_human_readable",
|
| 1364 |
"status": "pass",
|
| 1365 |
+
"value": "Detect whether motion and visual/depth streams have been artificially shifted out of sync.",
|
| 1366 |
"raw_hits": []
|
| 1367 |
},
|
| 1368 |
{
|
| 1369 |
+
"name": "misalignment_detection: public_field_research_name_is_human_readable",
|
| 1370 |
"status": "pass",
|
| 1371 |
+
"value": "Cross-Modal Misalignment Detection",
|
| 1372 |
+
"raw_hits": []
|
| 1373 |
+
},
|
| 1374 |
+
{
|
| 1375 |
+
"name": "misalignment_detection: public_field_input_short_is_human_readable",
|
| 1376 |
+
"status": "pass",
|
| 1377 |
+
"value": "motion-side and visual/depth-side feature groups",
|
| 1378 |
"raw_hits": []
|
| 1379 |
},
|
| 1380 |
{
|
|
|
|
| 1390 |
"raw_hits": []
|
| 1391 |
},
|
| 1392 |
{
|
| 1393 |
+
"name": "misalignment_detection: public_field_process_short_is_human_readable",
|
| 1394 |
"status": "pass",
|
| 1395 |
+
"value": "aligned/shifted pairs -> feature combiner -> binary classifier",
|
| 1396 |
"raw_hits": []
|
| 1397 |
},
|
| 1398 |
{
|
|
|
|
| 1401 |
"value": "Detect when modalities that should match are shifted out of sync.",
|
| 1402 |
"raw_hits": []
|
| 1403 |
},
|
| 1404 |
{
|
| 1405 |
"name": "misalignment_detection: known_task_family",
|
| 1406 |
"status": "pass",
|
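The entries in this report share one shape: a `name`, a `status`, and usually a `value` with a `raw_hits` list. The Python sketch below shows one way such entries could be re-checked. It assumes the entries have already been collected into a list of dicts, since the report key that holds them is not visible in this diff. `failing_checks` is a hypothetical helper, not a function from the repository.

```python
def failing_checks(checks):
    """Return the names of entries that are not a clean pass.

    Assumption: each entry is a dict shaped like the ones in the diff
    ("name", "status", optional "raw_hits"). Illustrative helper only.
    """
    failures = []
    for check in checks:
        not_passed = check.get("status") != "pass"
        has_raw_hits = bool(check.get("raw_hits"))  # any non-empty list counts as a hit
        if not_passed or has_raw_hits:
            failures.append(check.get("name", "<unnamed>"))
    return failures


# Two entries copied from the diff above.
sample_checks = [
    {
        "name": "temporal_order: public_field_output_short_is_human_readable",
        "status": "pass",
        "value": "correct or reversed",
        "raw_hits": [],
    },
    {
        "name": "next_action: public_field_input_short_is_human_readable",
        "status": "pass",
        "value": "current window at time t",
        "raw_hits": [],
    },
]

print(failing_checks(sample_checks))
```

Treating a non-empty `raw_hits` list as a failure even when `status` is `"pass"` is a deliberately strict choice for this sketch; the real validator may weigh the two fields differently.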
docs/data/website_integrity.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"docs_root": "docs",
|
| 5 |
"site_base": "/ropedia-xperience-10m-task-suite/",
|
| 6 |
"summary": {
|
|
@@ -251,7 +251,7 @@
|
|
| 251 |
},
|
| 252 |
{
|
| 253 |
"path": "data/artifact_index.json",
|
| 254 |
-
"bytes":
|
| 255 |
"top_level_type": "dict"
|
| 256 |
},
|
| 257 |
{
|
|
@@ -291,7 +291,7 @@
|
|
| 291 |
},
|
| 292 |
{
|
| 293 |
"path": "data/mirror_parity.json",
|
| 294 |
-
"bytes":
|
| 295 |
"top_level_type": "dict"
|
| 296 |
},
|
| 297 |
{
|
|
@@ -301,7 +301,7 @@
|
|
| 301 |
},
|
| 302 |
{
|
| 303 |
"path": "data/omni_finetune_verified_result.json",
|
| 304 |
-
"bytes":
|
| 305 |
"top_level_type": "dict"
|
| 306 |
},
|
| 307 |
{
|
|
@@ -321,7 +321,7 @@
|
|
| 321 |
},
|
| 322 |
{
|
| 323 |
"path": "data/project_status.json",
|
| 324 |
-
"bytes":
|
| 325 |
"top_level_type": "dict"
|
| 326 |
},
|
| 327 |
{
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:01+00:00",
|
| 4 |
"docs_root": "docs",
|
| 5 |
"site_base": "/ropedia-xperience-10m-task-suite/",
|
| 6 |
"summary": {
|
|
|
|
| 251 |
},
|
| 252 |
{
|
| 253 |
"path": "data/artifact_index.json",
|
| 254 |
+
"bytes": 39486,
|
| 255 |
"top_level_type": "dict"
|
| 256 |
},
|
| 257 |
{
|
|
|
|
| 291 |
},
|
| 292 |
{
|
| 293 |
"path": "data/mirror_parity.json",
|
| 294 |
+
"bytes": 126335,
|
| 295 |
"top_level_type": "dict"
|
| 296 |
},
|
| 297 |
{
|
|
|
|
| 301 |
},
|
| 302 |
{
|
| 303 |
"path": "data/omni_finetune_verified_result.json",
|
| 304 |
+
"bytes": 4142,
|
| 305 |
"top_level_type": "dict"
|
| 306 |
},
|
| 307 |
{
|
|
|
|
| 321 |
},
|
| 322 |
{
|
| 323 |
"path": "data/project_status.json",
|
| 324 |
+
"bytes": 11274,
|
| 325 |
"top_level_type": "dict"
|
| 326 |
},
|
| 327 |
{
|
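Each file entry in this report records a `path`, a `bytes` count, and a `top_level_type`. The sketch below shows how those two measured fields could be recomputed for a single JSON file. `describe_json_file` is a hypothetical name, and resolving `path` against `docs_root` is left to the caller because the diff does not show how the real check does it.

```python
import json
from pathlib import Path


def describe_json_file(path):
    """Recompute the size and top-level type of one JSON file.

    Assumption: "bytes" is the on-disk file size and "top_level_type" is the
    Python type name of the parsed document (e.g. "dict" for a JSON object).
    """
    raw = Path(path).read_bytes()
    parsed = json.loads(raw)  # json.loads accepts UTF-8 encoded bytes
    return {"bytes": len(raw), "top_level_type": type(parsed).__name__}
```

Comparing the returned dict with the recorded entry would flag a file that changed size or stopped being a JSON object after regeneration.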
metrics/artifact_index.json
CHANGED
|
@@ -1,12 +1,12 @@
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Artifact Index",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"status": "pass",
|
| 5 |
-
"artifact_count":
|
| 6 |
"missing": [],
|
| 7 |
"by_kind": {
|
| 8 |
"project_path": 14,
|
| 9 |
-
"scaleup_contract":
|
| 10 |
"project_scope": 1,
|
| 11 |
"source_alignment": 5,
|
| 12 |
"publication_workflow": 3,
|
|
@@ -28,7 +28,7 @@
|
|
| 28 |
"onboarding_doc": 1,
|
| 29 |
"generated_figure": 3,
|
| 30 |
"generated_figure_assets": 1,
|
| 31 |
-
"scaleup_status":
|
| 32 |
"citation": 1,
|
| 33 |
"license": 1
|
| 34 |
},
|
|
@@ -63,8 +63,8 @@
|
|
| 63 |
"surface": "repo_hf",
|
| 64 |
"shows": "Gives a compact current-state table for first-pass readers.",
|
| 65 |
"exists": true,
|
| 66 |
-
"bytes":
|
| 67 |
-
"sha256": "
|
| 68 |
},
|
| 69 |
{
|
| 70 |
"id": "project_status_json",
|
|
@@ -74,8 +74,8 @@
|
|
| 74 |
"surface": "website_hf",
|
| 75 |
"shows": "Machine-readable copy of the current project status for website and HF mirrors.",
|
| 76 |
"exists": true,
|
| 77 |
-
"bytes":
|
| 78 |
-
"sha256": "
|
| 79 |
},
|
| 80 |
{
|
| 81 |
"id": "research_roadmap",
|
|
@@ -187,6 +187,17 @@
|
|
| 187 |
"bytes": 6519,
|
| 188 |
"sha256": "a3773fc681e298325e2be80556d6be6e7e30b90ba22ee24b66633f07ff9c4ea4"
|
| 189 |
},
|
| 190 |
{
|
| 191 |
"id": "additional_development_directions",
|
| 192 |
"title": "Additional development directions",
|
|
@@ -250,8 +261,8 @@
|
|
| 250 |
"surface": "repo_hf",
|
| 251 |
"shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
|
| 252 |
"exists": true,
|
| 253 |
-
"bytes":
|
| 254 |
-
"sha256": "
|
| 255 |
},
|
| 256 |
{
|
| 257 |
"id": "official_dataset_card_alignment",
|
|
@@ -695,8 +706,8 @@
|
|
| 695 |
"surface": "repo_hf",
|
| 696 |
"shows": "Generates the selective artifact catalog from local files.",
|
| 697 |
"exists": true,
|
| 698 |
-
"bytes":
|
| 699 |
-
"sha256": "
|
| 700 |
},
|
| 701 |
{
|
| 702 |
"id": "publication_audit",
|
|
@@ -731,7 +742,7 @@
|
|
| 731 |
"volatile": true,
|
| 732 |
"shows": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, website HTML, and validator files.",
|
| 733 |
"exists": true,
|
| 734 |
-
"bytes":
|
| 735 |
"hash_policy": "existence_and_size_only"
|
| 736 |
},
|
| 737 |
{
|
|
@@ -933,6 +944,28 @@
|
|
| 933 |
"bytes": 3076,
|
| 934 |
"sha256": "23b87581cfc1d95b0af118a0dbb4e601f42fc6bad608759490e13a9a1ef73205"
|
| 935 |
},
|
| 936 |
{
|
| 937 |
"id": "citation",
|
| 938 |
"title": "Citation metadata",
|
|
|
|
| 1 |
{
|
| 2 |
"title": "Ropedia Xperience-10M Task Suite Artifact Index",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:53:45+00:00",
|
| 4 |
"status": "pass",
|
| 5 |
+
"artifact_count": 86,
|
| 6 |
"missing": [],
|
| 7 |
"by_kind": {
|
| 8 |
"project_path": 14,
|
| 9 |
+
"scaleup_contract": 7,
|
| 10 |
"project_scope": 1,
|
| 11 |
"source_alignment": 5,
|
| 12 |
"publication_workflow": 3,
|
|
|
|
| 28 |
"onboarding_doc": 1,
|
| 29 |
"generated_figure": 3,
|
| 30 |
"generated_figure_assets": 1,
|
| 31 |
+
"scaleup_status": 4,
|
| 32 |
"citation": 1,
|
| 33 |
"license": 1
|
| 34 |
},
|
|
|
|
| 63 |
"surface": "repo_hf",
|
| 64 |
"shows": "Gives a compact current-state table for first-pass readers.",
|
| 65 |
"exists": true,
|
| 66 |
+
"bytes": 8805,
|
| 67 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 68 |
},
|
| 69 |
{
|
| 70 |
"id": "project_status_json",
|
|
|
|
| 74 |
"surface": "website_hf",
|
| 75 |
"shows": "Machine-readable copy of the current project status for website and HF mirrors.",
|
| 76 |
"exists": true,
|
| 77 |
+
"bytes": 11274,
|
| 78 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 79 |
},
|
| 80 |
{
|
| 81 |
"id": "research_roadmap",
|
|
|
|
| 187 |
"bytes": 6519,
|
| 188 |
"sha256": "a3773fc681e298325e2be80556d6be6e7e30b90ba22ee24b66633f07ff9c4ea4"
|
| 189 |
},
|
| 190 |
+
{
|
| 191 |
+
"id": "qwen3_omni_error_analysis_script",
|
| 192 |
+
"title": "Qwen3-Omni held-out error-analysis script",
|
| 193 |
+
"path": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 194 |
+
"kind": "scaleup_contract",
|
| 195 |
+
"surface": "repo_hf",
|
| 196 |
+
"shows": "Computes public-safe held-out error-analysis tables by episode, action family, train-seen status, required-modality state, and object category.",
|
| 197 |
+
"exists": true,
|
| 198 |
+
"bytes": 15676,
|
| 199 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 200 |
+
},
|
| 201 |
{
|
| 202 |
"id": "additional_development_directions",
|
| 203 |
"title": "Additional development directions",
|
|
|
|
| 261 |
"surface": "repo_hf",
|
| 262 |
"shows": "Gives the human-readable map from project scope to data, tasks, platform mirrors, and scale-up status.",
|
| 263 |
"exists": true,
|
| 264 |
+
"bytes": 16318,
|
| 265 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 266 |
},
|
| 267 |
{
|
| 268 |
"id": "official_dataset_card_alignment",
|
|
|
|
| 706 |
"surface": "repo_hf",
|
| 707 |
"shows": "Generates the selective artifact catalog from local files.",
|
| 708 |
"exists": true,
|
| 709 |
+
"bytes": 32191,
|
| 710 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 711 |
},
|
| 712 |
{
|
| 713 |
"id": "publication_audit",
|
|
|
|
  "volatile": true,
  "shows": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, website HTML, and validator files.",
  "exists": true,
+ "bytes": 126335,
  "hash_policy": "existence_and_size_only"
  },
  {
|
|
|
|
  "bytes": 3076,
  "sha256": "23b87581cfc1d95b0af118a0dbb4e601f42fc6bad608759490e13a9a1ef73205"
  },
+ {
+ "id": "qwen3_omni_error_analysis_report",
+ "title": "Qwen3-Omni held-out error-analysis report",
+ "path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
+ "kind": "scaleup_status",
+ "surface": "repo_hf",
+ "shows": "Summarizes validation-aware Qwen3-Omni held-out failures by episode, action family, train-seen status, required-modality state, and object category.",
+ "exists": true,
+ "bytes": 3331,
+ "sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
+ },
+ {
+ "id": "qwen3_omni_error_analysis_json",
+ "title": "Qwen3-Omni held-out error-analysis JSON",
+ "path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
+ "kind": "scaleup_status",
+ "surface": "repo_hf",
+ "shows": "Machine-readable Qwen3-Omni held-out error analysis with grouped metrics and sanitized failure examples.",
+ "exists": true,
+ "bytes": 25202,
+ "sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
+ },
  {
  "id": "citation",
  "title": "Citation metadata",
|
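The index entries above record `exists`, `bytes`, and `sha256` for each artifact, and the volatile mirror-parity entry carries `"hash_policy": "existence_and_size_only"` with no digest. As a minimal sketch of how such an entry could be checked against a file on disk (the helper names `describe_file` and `check_entry` are hypothetical and are not taken from `scripts/build_artifact_index.py` or `scripts/validate_mirror_parity.py`):

```python
import hashlib
from pathlib import Path


def describe_file(path: Path) -> dict:
    """Return the exists/bytes/sha256 fields in the shape the index entries use."""
    if not path.is_file():
        return {"exists": False}
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large artifacts are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "exists": True,
        "bytes": path.stat().st_size,
        "sha256": digest.hexdigest(),
    }


def check_entry(entry: dict, root: Path) -> list[str]:
    """Compare one index entry with the file under root; return mismatch messages."""
    name = entry["path"]
    actual = describe_file(root / name)
    if actual["exists"] != entry.get("exists", True):
        return [f"{name}: exists mismatch"]
    if not actual["exists"]:
        return []
    problems = []
    if actual["bytes"] != entry.get("bytes"):
        problems.append(f"{name}: bytes mismatch")
    # Assumption: entries marked existence_and_size_only skip the digest check,
    # which is how the hash_policy field is read in this sketch.
    if entry.get("hash_policy") != "existence_and_size_only":
        if actual["sha256"] != entry.get("sha256"):
            problems.append(f"{name}: sha256 mismatch")
    return problems
```

Calling `check_entry(entry, Path("."))` for each entry and collecting the returned messages gives a list that plays the same role as the empty `"failures": []` arrays in the parity report, although the real validator may apply different rules.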
metrics/mirror_parity.json
CHANGED
|
@@ -1,9 +1,9 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"hf_root": "hf_publish",
|
| 5 |
"summary": {
|
| 6 |
-
"group_count":
|
| 7 |
"failure_count": 0,
|
| 8 |
"failures_by_surface": {}
|
| 9 |
},
|
|
@@ -102,27 +102,27 @@
|
|
| 102 |
"local": {
|
| 103 |
"path": "repo:docs/data/artifact_index.json",
|
| 104 |
"exists": true,
|
| 105 |
-
"bytes":
|
| 106 |
-
"sha256": "
|
| 107 |
},
|
| 108 |
"mirrors": {
|
| 109 |
"hf_space": {
|
| 110 |
"path": "hf_space:data/artifact_index.json",
|
| 111 |
"exists": true,
|
| 112 |
-
"bytes":
|
| 113 |
-
"sha256": "
|
| 114 |
},
|
| 115 |
"hf_artifacts": {
|
| 116 |
"path": "hf_artifacts:docs/data/artifact_index.json",
|
| 117 |
"exists": true,
|
| 118 |
-
"bytes":
|
| 119 |
-
"sha256": "
|
| 120 |
},
|
| 121 |
"hf_model": {
|
| 122 |
"path": "hf_model:metrics/artifact_index.json",
|
| 123 |
"exists": true,
|
| 124 |
-
"bytes":
|
| 125 |
-
"sha256": "
|
| 126 |
}
|
| 127 |
},
|
| 128 |
"failures": []
|
|
@@ -350,27 +350,27 @@
|
|
| 350 |
"local": {
|
| 351 |
"path": "repo:docs/data/omni_finetune_verified_result.json",
|
| 352 |
"exists": true,
|
| 353 |
-
"bytes":
|
| 354 |
-
"sha256": "
|
| 355 |
},
|
| 356 |
"mirrors": {
|
| 357 |
"hf_space": {
|
| 358 |
"path": "hf_space:data/omni_finetune_verified_result.json",
|
| 359 |
"exists": true,
|
| 360 |
-
"bytes":
|
| 361 |
-
"sha256": "
|
| 362 |
},
|
| 363 |
"hf_artifacts": {
|
| 364 |
"path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
|
| 365 |
"exists": true,
|
| 366 |
-
"bytes":
|
| 367 |
-
"sha256": "
|
| 368 |
},
|
| 369 |
"hf_model": {
|
| 370 |
"path": "hf_model:metrics/omni_finetune_verified_result.json",
|
| 371 |
"exists": true,
|
| 372 |
-
"bytes":
|
| 373 |
-
"sha256": "
|
| 374 |
}
|
| 375 |
},
|
| 376 |
"failures": []
|
|
@@ -474,27 +474,27 @@
|
|
| 474 |
"local": {
|
| 475 |
"path": "repo:docs/data/project_status.json",
|
| 476 |
"exists": true,
|
| 477 |
-
"bytes":
|
| 478 |
-
"sha256": "
|
| 479 |
},
|
| 480 |
"mirrors": {
|
| 481 |
"hf_space": {
|
| 482 |
"path": "hf_space:data/project_status.json",
|
| 483 |
"exists": true,
|
| 484 |
-
"bytes":
|
| 485 |
-
"sha256": "
|
| 486 |
},
|
| 487 |
"hf_artifacts": {
|
| 488 |
"path": "hf_artifacts:docs/data/project_status.json",
|
| 489 |
"exists": true,
|
| 490 |
-
"bytes":
|
| 491 |
-
"sha256": "
|
| 492 |
},
|
| 493 |
"hf_model": {
|
| 494 |
"path": "hf_model:metrics/project_status.json",
|
| 495 |
"exists": true,
|
| 496 |
-
"bytes":
|
| 497 |
-
"sha256": "
|
| 498 |
}
|
| 499 |
},
|
| 500 |
"failures": []
|
|
@@ -506,26 +506,26 @@
|
|
| 506 |
"path": "repo:docs/data/publication_audit.json",
|
| 507 |
"exists": true,
|
| 508 |
"bytes": 7237,
|
| 509 |
-
"sha256": "
|
| 510 |
},
|
| 511 |
"mirrors": {
|
| 512 |
"hf_space": {
|
| 513 |
"path": "hf_space:data/publication_audit.json",
|
| 514 |
"exists": true,
|
| 515 |
"bytes": 7237,
|
| 516 |
-
"sha256": "
|
| 517 |
},
|
| 518 |
"hf_artifacts": {
|
| 519 |
"path": "hf_artifacts:docs/data/publication_audit.json",
|
| 520 |
"exists": true,
|
| 521 |
"bytes": 7237,
|
| 522 |
-
"sha256": "
|
| 523 |
},
|
| 524 |
"hf_model": {
|
| 525 |
"path": "hf_model:metrics/publication_audit.json",
|
| 526 |
"exists": true,
|
| 527 |
"bytes": 7237,
|
| 528 |
-
"sha256": "
|
| 529 |
}
|
| 530 |
},
|
| 531 |
"failures": []
|
|
@@ -816,26 +816,26 @@
|
|
| 816 |
"path": "repo:docs/data/scope_claims_audit.json",
|
| 817 |
"exists": true,
|
| 818 |
"bytes": 20823,
|
| 819 |
-
"sha256": "
|
| 820 |
},
|
| 821 |
"mirrors": {
|
| 822 |
"hf_space": {
|
| 823 |
"path": "hf_space:data/scope_claims_audit.json",
|
| 824 |
"exists": true,
|
| 825 |
"bytes": 20823,
|
| 826 |
-
"sha256": "
|
| 827 |
},
|
| 828 |
"hf_artifacts": {
|
| 829 |
"path": "hf_artifacts:docs/data/scope_claims_audit.json",
|
| 830 |
"exists": true,
|
| 831 |
"bytes": 20823,
|
| 832 |
-
"sha256": "
|
| 833 |
},
|
| 834 |
"hf_model": {
|
| 835 |
"path": "hf_model:metrics/scope_claims_audit.json",
|
| 836 |
"exists": true,
|
| 837 |
"bytes": 20823,
|
| 838 |
-
"sha256": "
|
| 839 |
}
|
| 840 |
},
|
| 841 |
"failures": []
|
|
@@ -940,26 +940,26 @@
|
|
| 940 |
"path": "repo:docs/data/task_surface_integrity.json",
|
| 941 |
"exists": true,
|
| 942 |
"bytes": 45779,
|
| 943 |
-
"sha256": "
|
| 944 |
},
|
| 945 |
"mirrors": {
|
| 946 |
"hf_space": {
|
| 947 |
"path": "hf_space:data/task_surface_integrity.json",
|
| 948 |
"exists": true,
|
| 949 |
"bytes": 45779,
|
| 950 |
-
"sha256": "
|
| 951 |
},
|
| 952 |
"hf_artifacts": {
|
| 953 |
"path": "hf_artifacts:docs/data/task_surface_integrity.json",
|
| 954 |
"exists": true,
|
| 955 |
"bytes": 45779,
|
| 956 |
-
"sha256": "
|
| 957 |
},
|
| 958 |
"hf_model": {
|
| 959 |
"path": "hf_model:metrics/task_surface_integrity.json",
|
| 960 |
"exists": true,
|
| 961 |
"bytes": 45779,
|
| 962 |
-
"sha256": "
|
| 963 |
}
|
| 964 |
},
|
| 965 |
"failures": []
|
|
@@ -1002,26 +1002,26 @@
|
|
| 1002 |
"path": "repo:docs/data/website_integrity.json",
|
| 1003 |
"exists": true,
|
| 1004 |
"bytes": 15221,
|
| 1005 |
-
"sha256": "
|
| 1006 |
},
|
| 1007 |
"mirrors": {
|
| 1008 |
"hf_space": {
|
| 1009 |
"path": "hf_space:data/website_integrity.json",
|
| 1010 |
"exists": true,
|
| 1011 |
"bytes": 15221,
|
| 1012 |
-
"sha256": "
|
| 1013 |
},
|
| 1014 |
"hf_artifacts": {
|
| 1015 |
"path": "hf_artifacts:docs/data/website_integrity.json",
|
| 1016 |
"exists": true,
|
| 1017 |
"bytes": 15221,
|
| 1018 |
-
"sha256": "
|
| 1019 |
},
|
| 1020 |
"hf_model": {
|
| 1021 |
"path": "hf_model:metrics/website_integrity.json",
|
| 1022 |
"exists": true,
|
| 1023 |
"bytes": 15221,
|
| 1024 |
-
"sha256": "
|
| 1025 |
}
|
| 1026 |
},
|
| 1027 |
"failures": []
|
|
@@ -1723,6 +1723,31 @@
|
|
| 1723 |
},
|
| 1724 |
"failures": []
|
| 1725 |
},
|
| 1726 |
{
|
| 1727 |
"name": "scripts/audio_ablation_and_raw_upgrade.py",
|
| 1728 |
"status": "pass",
|
|
@@ -1754,21 +1779,21 @@
|
|
| 1754 |
"local": {
|
| 1755 |
"path": "repo:scripts/build_artifact_index.py",
|
| 1756 |
"exists": true,
|
| 1757 |
-
"bytes":
|
| 1758 |
-
"sha256": "
|
| 1759 |
},
|
| 1760 |
"mirrors": {
|
| 1761 |
"hf_artifacts": {
|
| 1762 |
"path": "hf_artifacts:scripts/build_artifact_index.py",
|
| 1763 |
"exists": true,
|
| 1764 |
-
"bytes":
|
| 1765 |
-
"sha256": "
|
| 1766 |
},
|
| 1767 |
"hf_model": {
|
| 1768 |
"path": "hf_model:scripts/build_artifact_index.py",
|
| 1769 |
"exists": true,
|
| 1770 |
-
"bytes":
|
| 1771 |
-
"sha256": "
|
| 1772 |
}
|
| 1773 |
},
|
| 1774 |
"failures": []
|
|
@@ -2054,21 +2079,21 @@
|
|
| 2054 |
"local": {
|
| 2055 |
"path": "repo:scripts/validate_mirror_parity.py",
|
| 2056 |
"exists": true,
|
| 2057 |
-
"bytes":
|
| 2058 |
-
"sha256": "
|
| 2059 |
},
|
| 2060 |
"mirrors": {
|
| 2061 |
"hf_artifacts": {
|
| 2062 |
"path": "hf_artifacts:scripts/validate_mirror_parity.py",
|
| 2063 |
"exists": true,
|
| 2064 |
-
"bytes":
|
| 2065 |
-
"sha256": "
|
| 2066 |
},
|
| 2067 |
"hf_model": {
|
| 2068 |
"path": "hf_model:scripts/validate_mirror_parity.py",
|
| 2069 |
"exists": true,
|
| 2070 |
-
"bytes":
|
| 2071 |
-
"sha256": "
|
| 2072 |
}
|
| 2073 |
},
|
| 2074 |
"failures": []
|
|
@@ -2807,6 +2832,285 @@
|
|
| 2807 |
},
|
| 2808 |
"failures": []
|
| 2809 |
},
|
| 2810 |
{
|
| 2811 |
"name": "docs/QUALITY_GATES.md",
|
| 2812 |
"status": "pass",
|
|
@@ -3061,27 +3365,27 @@
|
|
| 3061 |
"local": {
|
| 3062 |
"path": "repo:PROJECT_STATUS.md",
|
| 3063 |
"exists": true,
|
| 3064 |
-
"bytes":
|
| 3065 |
-
"sha256": "
|
| 3066 |
},
|
| 3067 |
"mirrors": {
|
| 3068 |
"hf_space": {
|
| 3069 |
"path": "hf_space:PROJECT_STATUS.md",
|
| 3070 |
"exists": true,
|
| 3071 |
-
"bytes":
|
| 3072 |
-
"sha256": "
|
| 3073 |
},
|
| 3074 |
"hf_artifacts": {
|
| 3075 |
"path": "hf_artifacts:PROJECT_STATUS.md",
|
| 3076 |
"exists": true,
|
| 3077 |
-
"bytes":
|
| 3078 |
-
"sha256": "
|
| 3079 |
},
|
| 3080 |
"hf_model": {
|
| 3081 |
"path": "hf_model:PROJECT_STATUS.md",
|
| 3082 |
"exists": true,
|
| 3083 |
-
"bytes":
|
| 3084 |
-
"sha256": "
|
| 3085 |
}
|
| 3086 |
},
|
| 3087 |
"failures": []
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:56:44+00:00",
|
| 4 |
"hf_root": "hf_publish",
|
| 5 |
"summary": {
|
| 6 |
+
"group_count": 114,
|
| 7 |
"failure_count": 0,
|
| 8 |
"failures_by_surface": {}
|
| 9 |
},
|
|
|
|
| 102 |
"local": {
|
| 103 |
"path": "repo:docs/data/artifact_index.json",
|
| 104 |
"exists": true,
|
| 105 |
+
"bytes": 39486,
|
| 106 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 107 |
},
|
| 108 |
"mirrors": {
|
| 109 |
"hf_space": {
|
| 110 |
"path": "hf_space:data/artifact_index.json",
|
| 111 |
"exists": true,
|
| 112 |
+
"bytes": 39486,
|
| 113 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 114 |
},
|
| 115 |
"hf_artifacts": {
|
| 116 |
"path": "hf_artifacts:docs/data/artifact_index.json",
|
| 117 |
"exists": true,
|
| 118 |
+
"bytes": 39486,
|
| 119 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 120 |
},
|
| 121 |
"hf_model": {
|
| 122 |
"path": "hf_model:metrics/artifact_index.json",
|
| 123 |
"exists": true,
|
| 124 |
+
"bytes": 39486,
|
| 125 |
+
"sha256": "87782cd08bc1106d694a727e21333450d2965b48c48f500d1b6f4294d7b247d0"
|
| 126 |
}
|
| 127 |
},
|
| 128 |
"failures": []
|
|
|
|
| 350 |
"local": {
|
| 351 |
"path": "repo:docs/data/omni_finetune_verified_result.json",
|
| 352 |
"exists": true,
|
| 353 |
+
"bytes": 4142,
|
| 354 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 355 |
},
|
| 356 |
"mirrors": {
|
| 357 |
"hf_space": {
|
| 358 |
"path": "hf_space:data/omni_finetune_verified_result.json",
|
| 359 |
"exists": true,
|
| 360 |
+
"bytes": 4142,
|
| 361 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 362 |
},
|
| 363 |
"hf_artifacts": {
|
| 364 |
"path": "hf_artifacts:docs/data/omni_finetune_verified_result.json",
|
| 365 |
"exists": true,
|
| 366 |
+
"bytes": 4142,
|
| 367 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 368 |
},
|
| 369 |
"hf_model": {
|
| 370 |
"path": "hf_model:metrics/omni_finetune_verified_result.json",
|
| 371 |
"exists": true,
|
| 372 |
+
"bytes": 4142,
|
| 373 |
+
"sha256": "297aa6fc86bc09ba7968f3c5c2db265320c0613c5ec9a36701114ba451321b81"
|
| 374 |
}
|
| 375 |
},
|
| 376 |
"failures": []
|
|
|
|
| 474 |
"local": {
|
| 475 |
"path": "repo:docs/data/project_status.json",
|
| 476 |
"exists": true,
|
| 477 |
+
"bytes": 11274,
|
| 478 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 479 |
},
|
| 480 |
"mirrors": {
|
| 481 |
"hf_space": {
|
| 482 |
"path": "hf_space:data/project_status.json",
|
| 483 |
"exists": true,
|
| 484 |
+
"bytes": 11274,
|
| 485 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 486 |
},
|
| 487 |
"hf_artifacts": {
|
| 488 |
"path": "hf_artifacts:docs/data/project_status.json",
|
| 489 |
"exists": true,
|
| 490 |
+
"bytes": 11274,
|
| 491 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 492 |
},
|
| 493 |
"hf_model": {
|
| 494 |
"path": "hf_model:metrics/project_status.json",
|
| 495 |
"exists": true,
|
| 496 |
+
"bytes": 11274,
|
| 497 |
+
"sha256": "ae2b2c520ab1e0553fa399439345edd87832fa5293d8c27ffe610ede5bfa1067"
|
| 498 |
}
|
| 499 |
},
|
| 500 |
"failures": []
|
|
|
|
| 506 |
"path": "repo:docs/data/publication_audit.json",
|
| 507 |
"exists": true,
|
| 508 |
"bytes": 7237,
|
| 509 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 510 |
},
|
| 511 |
"mirrors": {
|
| 512 |
"hf_space": {
|
| 513 |
"path": "hf_space:data/publication_audit.json",
|
| 514 |
"exists": true,
|
| 515 |
"bytes": 7237,
|
| 516 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 517 |
},
|
| 518 |
"hf_artifacts": {
|
| 519 |
"path": "hf_artifacts:docs/data/publication_audit.json",
|
| 520 |
"exists": true,
|
| 521 |
"bytes": 7237,
|
| 522 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 523 |
},
|
| 524 |
"hf_model": {
|
| 525 |
"path": "hf_model:metrics/publication_audit.json",
|
| 526 |
"exists": true,
|
| 527 |
"bytes": 7237,
|
| 528 |
+
"sha256": "8a21c29d92f3a15b835c37d7784c17fada3edbda050515deed8e440535ed046d"
|
| 529 |
}
|
| 530 |
},
|
| 531 |
"failures": []
|
|
|
|
| 816 |
"path": "repo:docs/data/scope_claims_audit.json",
|
| 817 |
"exists": true,
|
| 818 |
"bytes": 20823,
|
| 819 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 820 |
},
|
| 821 |
"mirrors": {
|
| 822 |
"hf_space": {
|
| 823 |
"path": "hf_space:data/scope_claims_audit.json",
|
| 824 |
"exists": true,
|
| 825 |
"bytes": 20823,
|
| 826 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 827 |
},
|
| 828 |
"hf_artifacts": {
|
| 829 |
"path": "hf_artifacts:docs/data/scope_claims_audit.json",
|
| 830 |
"exists": true,
|
| 831 |
"bytes": 20823,
|
| 832 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 833 |
},
|
| 834 |
"hf_model": {
|
| 835 |
"path": "hf_model:metrics/scope_claims_audit.json",
|
| 836 |
"exists": true,
|
| 837 |
"bytes": 20823,
|
| 838 |
+
"sha256": "77402dc77c4ecf5cf1e68480ae2c9822a134ae7ef4a24a7b8b9008a2509c2fa3"
|
| 839 |
}
|
| 840 |
},
|
| 841 |
"failures": []
|
|
|
|
| 940 |
"path": "repo:docs/data/task_surface_integrity.json",
|
| 941 |
"exists": true,
|
| 942 |
"bytes": 45779,
|
| 943 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 944 |
},
|
| 945 |
"mirrors": {
|
| 946 |
"hf_space": {
|
| 947 |
"path": "hf_space:data/task_surface_integrity.json",
|
| 948 |
"exists": true,
|
| 949 |
"bytes": 45779,
|
| 950 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 951 |
},
|
| 952 |
"hf_artifacts": {
|
| 953 |
"path": "hf_artifacts:docs/data/task_surface_integrity.json",
|
| 954 |
"exists": true,
|
| 955 |
"bytes": 45779,
|
| 956 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 957 |
},
|
| 958 |
"hf_model": {
|
| 959 |
"path": "hf_model:metrics/task_surface_integrity.json",
|
| 960 |
"exists": true,
|
| 961 |
"bytes": 45779,
|
| 962 |
+
"sha256": "8232e2bafa8b5157d97c018e41be5da3ec69ddb4d2020a0dcc7c6377c5575bb6"
|
| 963 |
}
|
| 964 |
},
|
| 965 |
"failures": []
|
|
|
|
| 1002 |
"path": "repo:docs/data/website_integrity.json",
|
| 1003 |
"exists": true,
|
| 1004 |
"bytes": 15221,
|
| 1005 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1006 |
},
|
| 1007 |
"mirrors": {
|
| 1008 |
"hf_space": {
|
| 1009 |
"path": "hf_space:data/website_integrity.json",
|
| 1010 |
"exists": true,
|
| 1011 |
"bytes": 15221,
|
| 1012 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1013 |
},
|
| 1014 |
"hf_artifacts": {
|
| 1015 |
"path": "hf_artifacts:docs/data/website_integrity.json",
|
| 1016 |
"exists": true,
|
| 1017 |
"bytes": 15221,
|
| 1018 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1019 |
},
|
| 1020 |
"hf_model": {
|
| 1021 |
"path": "hf_model:metrics/website_integrity.json",
|
| 1022 |
"exists": true,
|
| 1023 |
"bytes": 15221,
|
| 1024 |
+
"sha256": "dcbd09b4c4522770c43504c500eb653de706538516ee2ec72e491ffc3416c6e2"
|
| 1025 |
}
|
| 1026 |
},
|
| 1027 |
"failures": []
|
|
|
|
| 1723 |
},
|
| 1724 |
"failures": []
|
| 1725 |
},
|
| 1726 |
+
{
|
| 1727 |
+
"name": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1728 |
+
"status": "pass",
|
| 1729 |
+
"local": {
|
| 1730 |
+
"path": "repo:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1731 |
+
"exists": true,
|
| 1732 |
+
"bytes": 15676,
|
| 1733 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 1734 |
+
},
|
| 1735 |
+
"mirrors": {
|
| 1736 |
+
"hf_artifacts": {
|
| 1737 |
+
"path": "hf_artifacts:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1738 |
+
"exists": true,
|
| 1739 |
+
"bytes": 15676,
|
| 1740 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 1741 |
+
},
|
| 1742 |
+
"hf_model": {
|
| 1743 |
+
"path": "hf_model:scripts/omni/analyze_qwen3_omni_errors.py",
|
| 1744 |
+
"exists": true,
|
| 1745 |
+
"bytes": 15676,
|
| 1746 |
+
"sha256": "d4c7e46d9fbd5f9d84bc32374f457fd8c9d68c8faa39c77bc45770eb95d80337"
|
| 1747 |
+
}
|
| 1748 |
+
},
|
| 1749 |
+
"failures": []
|
| 1750 |
+
},
|
| 1751 |
{
|
| 1752 |
"name": "scripts/audio_ablation_and_raw_upgrade.py",
|
| 1753 |
"status": "pass",
|
|
|
|
| 1779 |
"local": {
|
| 1780 |
"path": "repo:scripts/build_artifact_index.py",
|
| 1781 |
"exists": true,
|
| 1782 |
+
"bytes": 32191,
|
| 1783 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1784 |
},
|
| 1785 |
"mirrors": {
|
| 1786 |
"hf_artifacts": {
|
| 1787 |
"path": "hf_artifacts:scripts/build_artifact_index.py",
|
| 1788 |
"exists": true,
|
| 1789 |
+
"bytes": 32191,
|
| 1790 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1791 |
},
|
| 1792 |
"hf_model": {
|
| 1793 |
"path": "hf_model:scripts/build_artifact_index.py",
|
| 1794 |
"exists": true,
|
| 1795 |
+
"bytes": 32191,
|
| 1796 |
+
"sha256": "4a105c732d2f6c54a78333d7f47e0139325ba638027e34e6acd929a90626b8e0"
|
| 1797 |
}
|
| 1798 |
},
|
| 1799 |
"failures": []
|
|
|
|
| 2079 |
"local": {
|
| 2080 |
"path": "repo:scripts/validate_mirror_parity.py",
|
| 2081 |
"exists": true,
|
| 2082 |
+
"bytes": 13781,
|
| 2083 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2084 |
},
|
| 2085 |
"mirrors": {
|
| 2086 |
"hf_artifacts": {
|
| 2087 |
"path": "hf_artifacts:scripts/validate_mirror_parity.py",
|
| 2088 |
"exists": true,
|
| 2089 |
+
"bytes": 13781,
|
| 2090 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2091 |
},
|
| 2092 |
"hf_model": {
|
| 2093 |
"path": "hf_model:scripts/validate_mirror_parity.py",
|
| 2094 |
"exists": true,
|
| 2095 |
+
"bytes": 13781,
|
| 2096 |
+
"sha256": "3659adf936b058617dde97ee4c424615a361e59f5ea74975116422dfe01768e8"
|
| 2097 |
}
|
| 2098 |
},
|
| 2099 |
"failures": []
|
|
|
|
| 2832 |
},
|
| 2833 |
"failures": []
|
| 2834 |
},
|
| 2835 |
+
{
|
| 2836 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2837 |
+
"status": "pass",
|
| 2838 |
+
"local": {
|
| 2839 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2840 |
+
"exists": true,
|
| 2841 |
+
"bytes": 3331,
|
| 2842 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2843 |
+
},
|
| 2844 |
+
"mirrors": {
|
| 2845 |
+
"hf_space": {
|
| 2846 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2847 |
+
"exists": true,
|
| 2848 |
+
"bytes": 3331,
|
| 2849 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2850 |
+
},
|
| 2851 |
+
"hf_artifacts": {
|
| 2852 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2853 |
+
"exists": true,
|
| 2854 |
+
"bytes": 3331,
|
| 2855 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2856 |
+
},
|
| 2857 |
+
"hf_model": {
|
| 2858 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 2859 |
+
"exists": true,
|
| 2860 |
+
"bytes": 3331,
|
| 2861 |
+
"sha256": "063fcc2ebd7b57ab5b281fd5e8edc629da4e1f4e5a708483ba27375d02af9467"
|
| 2862 |
+
}
|
| 2863 |
+
},
|
| 2864 |
+
"failures": []
|
| 2865 |
+
},
|
| 2866 |
+
{
|
| 2867 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2868 |
+
"status": "pass",
|
| 2869 |
+
"local": {
|
| 2870 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2871 |
+
"exists": true,
|
| 2872 |
+
"bytes": 25202,
|
| 2873 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 2874 |
+
},
|
| 2875 |
+
"mirrors": {
|
| 2876 |
+
"hf_space": {
|
| 2877 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2878 |
+
"exists": true,
|
| 2879 |
+
"bytes": 25202,
|
| 2880 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 2881 |
+
},
|
| 2882 |
+
"hf_artifacts": {
|
| 2883 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2884 |
+
"exists": true,
|
| 2885 |
+
"bytes": 25202,
|
| 2886 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 2887 |
+
},
|
| 2888 |
+
"hf_model": {
|
| 2889 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 2890 |
+
"exists": true,
|
| 2891 |
+
"bytes": 25202,
|
| 2892 |
+
"sha256": "c2e4eaa686f5d9739a8d0bfd8ae51a453b94019489ed84a154e2bce2fa316ff5"
|
| 2893 |
+
}
|
| 2894 |
+
},
|
| 2895 |
+
"failures": []
|
| 2896 |
+
},
|
| 2897 |
+
{
|
| 2898 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2899 |
+
"status": "pass",
|
| 2900 |
+
"local": {
|
| 2901 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2902 |
+
"exists": true,
|
| 2903 |
+
"bytes": 2121,
|
| 2904 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 2905 |
+
},
|
| 2906 |
+
"mirrors": {
|
| 2907 |
+
"hf_space": {
|
| 2908 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2909 |
+
"exists": true,
|
| 2910 |
+
"bytes": 2121,
|
| 2911 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 2912 |
+
},
|
| 2913 |
+
"hf_artifacts": {
|
| 2914 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2915 |
+
"exists": true,
|
| 2916 |
+
"bytes": 2121,
|
| 2917 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 2918 |
+
},
|
| 2919 |
+
"hf_model": {
|
| 2920 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 2921 |
+
"exists": true,
|
| 2922 |
+
"bytes": 2121,
|
| 2923 |
+
"sha256": "7f0bc74140f100b9fe444c38eb74d155605bfc5984f665e653a2cd34a5cb96bd"
|
| 2924 |
+
}
|
| 2925 |
+
},
|
| 2926 |
+
"failures": []
|
| 2927 |
+
},
|
| 2928 |
+
{
|
| 2929 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2930 |
+
"status": "pass",
|
| 2931 |
+
"local": {
|
| 2932 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2933 |
+
"exists": true,
|
| 2934 |
+
"bytes": 1320,
|
| 2935 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 2936 |
+
},
|
| 2937 |
+
"mirrors": {
|
| 2938 |
+
"hf_space": {
|
| 2939 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2940 |
+
"exists": true,
|
| 2941 |
+
"bytes": 1320,
|
| 2942 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 2943 |
+
},
|
| 2944 |
+
"hf_artifacts": {
|
| 2945 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2946 |
+
"exists": true,
|
| 2947 |
+
"bytes": 1320,
|
| 2948 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 2949 |
+
},
|
| 2950 |
+
"hf_model": {
|
| 2951 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 2952 |
+
"exists": true,
|
| 2953 |
+
"bytes": 1320,
|
| 2954 |
+
"sha256": "e15bf22e96b887c4b00aeb8ba548f4fd72ea0aab0772cc59e9bdda517ad72430"
|
| 2955 |
+
}
|
| 2956 |
+
},
|
| 2957 |
+
"failures": []
|
| 2958 |
+
},
|
| 2959 |
+
{
|
| 2960 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2961 |
+
"status": "pass",
|
| 2962 |
+
"local": {
|
| 2963 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2964 |
+
"exists": true,
|
| 2965 |
+
"bytes": 572,
|
| 2966 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 2967 |
+
},
|
| 2968 |
+
"mirrors": {
|
| 2969 |
+
"hf_space": {
|
| 2970 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2971 |
+
"exists": true,
|
| 2972 |
+
"bytes": 572,
|
| 2973 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 2974 |
+
},
|
| 2975 |
+
"hf_artifacts": {
|
| 2976 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2977 |
+
"exists": true,
|
| 2978 |
+
"bytes": 572,
|
| 2979 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 2980 |
+
},
|
| 2981 |
+
"hf_model": {
|
| 2982 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 2983 |
+
"exists": true,
|
| 2984 |
+
"bytes": 572,
|
| 2985 |
+
"sha256": "cb196616b6f073266087d8cb7182e36c0a761607f3082ad78c350fd99e1996e7"
|
| 2986 |
+
}
|
| 2987 |
+
},
|
| 2988 |
+
"failures": []
|
| 2989 |
+
},
|
| 2990 |
+
{
|
| 2991 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 2992 |
+
"status": "pass",
|
| 2993 |
+
"local": {
|
| 2994 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 2995 |
+
"exists": true,
|
| 2996 |
+
"bytes": 408,
|
| 2997 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 2998 |
+
},
|
| 2999 |
+
"mirrors": {
|
| 3000 |
+
"hf_space": {
|
| 3001 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3002 |
+
"exists": true,
|
| 3003 |
+
"bytes": 408,
|
| 3004 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 3005 |
+
},
|
| 3006 |
+
"hf_artifacts": {
|
| 3007 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3008 |
+
"exists": true,
|
| 3009 |
+
"bytes": 408,
|
| 3010 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 3011 |
+
},
|
| 3012 |
+
"hf_model": {
|
| 3013 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 3014 |
+
"exists": true,
|
| 3015 |
+
"bytes": 408,
|
| 3016 |
+
"sha256": "6447cf285b466a914055adb0aef4f3d47bf82d33a277d8ca2e6f22c4f0f2a7f7"
|
| 3017 |
+
}
|
| 3018 |
+
},
|
| 3019 |
+
"failures": []
|
| 3020 |
+
},
|
| 3021 |
+
{
|
| 3022 |
+
"name": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3023 |
+
"status": "pass",
|
| 3024 |
+
"local": {
|
| 3025 |
+
"path": "repo:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3026 |
+
"exists": true,
|
| 3027 |
+
"bytes": 1704,
|
| 3028 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3029 |
+
},
|
| 3030 |
+
"mirrors": {
|
| 3031 |
+
"hf_space": {
|
| 3032 |
+
"path": "hf_space:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3033 |
+
"exists": true,
|
| 3034 |
+
"bytes": 1704,
|
| 3035 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3036 |
+
},
|
| 3037 |
+
"hf_artifacts": {
|
| 3038 |
+
"path": "hf_artifacts:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3039 |
+
"exists": true,
|
| 3040 |
+
"bytes": 1704,
|
| 3041 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3042 |
+
},
|
| 3043 |
+
"hf_model": {
|
| 3044 |
+
"path": "hf_model:results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 3045 |
+
"exists": true,
|
| 3046 |
+
"bytes": 1704,
|
| 3047 |
+
"sha256": "f9cbd5e566ef666fe2d1050cc5bdadc7967a2056bdaa1e2e9f88fb0c22ee0ef8"
|
| 3048 |
+
}
|
| 3049 |
+
},
|
| 3050 |
+
"failures": []
|
| 3051 |
+
},
|
| 3052 |
+
{
|
| 3053 |
+
"name": "docs/ARTIFACT_GUIDE.md",
|
| 3054 |
+
"status": "pass",
|
| 3055 |
+
"local": {
|
| 3056 |
+
"path": "repo:ARTIFACT_GUIDE.md",
|
| 3057 |
+
"exists": true,
|
| 3058 |
+
"bytes": 16318,
|
| 3059 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3060 |
+
},
|
| 3061 |
+
"mirrors": {
|
| 3062 |
+
"hf_space": {
|
| 3063 |
+
"path": "hf_space:ARTIFACT_GUIDE.md",
|
| 3064 |
+
"exists": true,
|
| 3065 |
+
"bytes": 16318,
|
| 3066 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3067 |
+
},
|
| 3068 |
+
"hf_artifacts": {
|
| 3069 |
+
"path": "hf_artifacts:ARTIFACT_GUIDE.md",
|
| 3070 |
+
"exists": true,
|
| 3071 |
+
"bytes": 16318,
|
| 3072 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3073 |
+
},
|
| 3074 |
+
"hf_model": {
|
| 3075 |
+
"path": "hf_model:ARTIFACT_GUIDE.md",
|
| 3076 |
+
"exists": true,
|
| 3077 |
+
"bytes": 16318,
|
| 3078 |
+
"sha256": "cda5f4b5be4b7a2d26aff6ed7f930bfba13dfc463d533a9880193c0a0611b677"
|
| 3079 |
+
}
|
| 3080 |
+
},
|
| 3081 |
+
"failures": []
|
| 3082 |
+
},
|
| 3083 |
+
{
|
| 3084 |
+
"name": "docs/OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3085 |
+
"status": "pass",
|
| 3086 |
+
"local": {
|
| 3087 |
+
"path": "repo:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3088 |
+
"exists": true,
|
| 3089 |
+
"bytes": 8900,
|
| 3090 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3091 |
+
},
|
| 3092 |
+
"mirrors": {
|
| 3093 |
+
"hf_space": {
|
| 3094 |
+
"path": "hf_space:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3095 |
+
"exists": true,
|
| 3096 |
+
"bytes": 8900,
|
| 3097 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3098 |
+
},
|
| 3099 |
+
"hf_artifacts": {
|
| 3100 |
+
"path": "hf_artifacts:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3101 |
+
"exists": true,
|
| 3102 |
+
"bytes": 8900,
|
| 3103 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3104 |
+
},
|
| 3105 |
+
"hf_model": {
|
| 3106 |
+
"path": "hf_model:OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 3107 |
+
"exists": true,
|
| 3108 |
+
"bytes": 8900,
|
| 3109 |
+
"sha256": "c4e51d0aa7536045c229418603a67c6b3c5f31c9d756ca7395cb0c9455f0ed6d"
|
| 3110 |
+
}
|
| 3111 |
+
},
|
| 3112 |
+
"failures": []
|
| 3113 |
+
},
|
| 3114 |
{
|
| 3115 |
"name": "docs/QUALITY_GATES.md",
|
| 3116 |
"status": "pass",
|
|
|
|
| 3365 |
"local": {
|
| 3366 |
"path": "repo:PROJECT_STATUS.md",
|
| 3367 |
"exists": true,
|
| 3368 |
+
"bytes": 8805,
|
| 3369 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3370 |
},
|
| 3371 |
"mirrors": {
|
| 3372 |
"hf_space": {
|
| 3373 |
"path": "hf_space:PROJECT_STATUS.md",
|
| 3374 |
"exists": true,
|
| 3375 |
+
"bytes": 8805,
|
| 3376 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3377 |
},
|
| 3378 |
"hf_artifacts": {
|
| 3379 |
"path": "hf_artifacts:PROJECT_STATUS.md",
|
| 3380 |
"exists": true,
|
| 3381 |
+
"bytes": 8805,
|
| 3382 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3383 |
},
|
| 3384 |
"hf_model": {
|
| 3385 |
"path": "hf_model:PROJECT_STATUS.md",
|
| 3386 |
"exists": true,
|
| 3387 |
+
"bytes": 8805,
|
| 3388 |
+
"sha256": "4051b78674306078880de33a144a499144b2487b11455c70a364a94cefa035a7"
|
| 3389 |
}
|
| 3390 |
},
|
| 3391 |
"failures": []
|
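Each mirror-parity entry above records `exists`, `bytes`, and `sha256` for the local file and for the `hf_space`, `hf_artifacts`, and `hf_model` mirrors, plus a `failures` list. The sketch below shows one way such an entry could be computed. It is a minimal illustration, not the repository's audit script: the helper names `file_record` and `parity_entry` are hypothetical, and the `path` field here is a plain filesystem path rather than the prefixed labels (`repo:`, `hf_space:`, ...) seen in the JSON.

```python
import hashlib
from pathlib import Path


def file_record(path: Path) -> dict:
    """Return the path/exists/bytes/sha256 record for one file (hypothetical helper)."""
    if not path.is_file():
        return {"path": str(path), "exists": False, "bytes": None, "sha256": None}
    data = path.read_bytes()
    return {
        "path": str(path),
        "exists": True,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def parity_entry(name: str, local: Path, mirrors: dict) -> dict:
    """Compare one local file against its mirror copies by size and SHA-256."""
    local_rec = file_record(local)
    mirror_recs = {label: file_record(p) for label, p in mirrors.items()}

    failures = []
    if not local_rec["exists"]:
        failures.append("local: missing")
    for label, rec in mirror_recs.items():
        if not rec["exists"]:
            failures.append(f"{label}: missing")
        elif (rec["bytes"], rec["sha256"]) != (local_rec["bytes"], local_rec["sha256"]):
            failures.append(f"{label}: content differs from local")

    return {
        "name": name,
        "status": "pass" if not failures else "fail",
        "local": local_rec,
        "mirrors": mirror_recs,
        "failures": failures,
    }
```

An entry passes only when every mirror exists and matches the local byte count and digest, which is the pattern every entry in this hunk shows (`"status": "pass"`, `"failures": []`).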
metrics/omni_finetune_verified_result.json
CHANGED
|
@@ -67,7 +67,28 @@
|
|
| 67 |
"audit_status": "pass",
|
| 68 |
"contains_raw_xperience10m_data": false,
|
| 69 |
"contains_qwen_base_weights": false,
|
| 70 |
-
"contains_lora_weights": false
|
| 71 |
},
|
| 72 |
"required_next_steps": [
|
| 73 |
"Improve JSON-format reliability through prompt, decoding, constrained parsing, or target formatting changes.",
|
|
|
|
| 67 |
"audit_status": "pass",
|
| 68 |
"contains_raw_xperience10m_data": false,
|
| 69 |
"contains_qwen_base_weights": false,
|
| 70 |
+
"contains_lora_weights": false,
|
| 71 |
+
"error_analysis": {
|
| 72 |
+
"status": "pass",
|
| 73 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 74 |
+
"markdown_report": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 75 |
+
"groupings": [
|
| 76 |
+
"episode",
|
| 77 |
+
"action_family",
|
| 78 |
+
"train_seen_status",
|
| 79 |
+
"required_modality_state",
|
| 80 |
+
"object_category"
|
| 81 |
+
],
|
| 82 |
+
"key_readouts": {
|
| 83 |
+
"parsed_prediction_rate": 0.8772321428571429,
|
| 84 |
+
"weakest_action_family": "locomotion",
|
| 85 |
+
"weakest_action_family_samples": 23,
|
| 86 |
+
"weakest_action_family_parsed_prediction_rate": 0.2608695652173913,
|
| 87 |
+
"seen_action_exact_rate": 0.04580152671755725,
|
| 88 |
+
"unseen_action_exact_rate": 0.015772870662460567,
|
| 89 |
+
"required_modality_state": "rrd_missing_only_required_modalities_present"
|
| 90 |
+
}
|
| 91 |
+
}
|
| 92 |
},
|
| 93 |
"required_next_steps": [
|
| 94 |
"Improve JSON-format reliability through prompt, decoding, constrained parsing, or target formatting changes.",
|
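The `key_readouts` rates above are stored as long decimals, which are easier to interpret as counts. The sketch below recovers the implied count from a rate and a sample total. It assumes, without confirmation from the package, that `parsed_prediction_rate` is taken over the 448 test predictions mentioned in the `metrics/project_status.json` readout. The 23-sample total for the locomotion family comes directly from `weakest_action_family_samples`. The helper name `implied_count` is hypothetical.

```python
def implied_count(rate: float, total: int, tol: float = 1e-9) -> int:
    """Recover the integer numerator behind a rate, given its denominator.

    Raises ValueError if no integer count reproduces the rate within `tol`,
    which would suggest the assumed denominator is wrong.
    """
    count = round(rate * total)
    if abs(count / total - rate) > tol:
        raise ValueError(f"rate {rate} is not a multiple of 1/{total}")
    return count


# Values copied from key_readouts; the 448 denominator is an assumption.
parsed = implied_count(0.8772321428571429, 448)
locomotion_parsed = implied_count(0.2608695652173913, 23)
print(parsed, locomotion_parsed)
```

Under that assumption the readouts correspond to 393 of 448 predictions parsed overall and 6 of 23 parsed in the locomotion family.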
metrics/project_status.json
CHANGED
|
@@ -180,10 +180,12 @@
|
|
| 180 |
"evidence": [
|
| 181 |
"docs/data/omni_finetune_verified_result.json",
|
| 182 |
"results/omni_finetune/verified_public/",
|
|
|
|
| 183 |
"scripts/omni/package_verified_omni_result.py",
|
| 184 |
-
"scripts/omni/audit_verified_omni_package.py"
|
|
|
|
| 185 |
],
|
| 186 |
-
"readout": "The selected 96/16/16 episode split produced a validation-aware public-safe held-out package with 3,808 exported windows, 512 validation windows,
|
| 187 |
},
|
| 188 |
{
|
| 189 |
"area": "Raw Xperience-10M redistribution",
|
|
|
|
| 180 |
"evidence": [
|
| 181 |
"docs/data/omni_finetune_verified_result.json",
|
| 182 |
"results/omni_finetune/verified_public/",
|
| 183 |
+
"results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/",
|
| 184 |
"scripts/omni/package_verified_omni_result.py",
|
| 185 |
+
"scripts/omni/audit_verified_omni_package.py",
|
| 186 |
+
"scripts/omni/analyze_qwen3_omni_errors.py"
|
| 187 |
],
|
| 188 |
+
"readout": "The selected 96/16/16 episode split produced a validation-aware public-safe held-out package with 3,808 exported windows, 512 validation windows, 448 test predictions, and derived error-analysis tables by episode, action family, train-seen status, required-modality state, and object category. JSON validity is 87.50%, below the 98% target, so it is a diagnostic baseline but not a strong model-quality result."
|
| 189 |
},
|
| 190 |
{
|
| 191 |
"area": "Raw Xperience-10M redistribution",
|
metrics/publication_audit.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"checks": [
|
| 5 |
{
|
| 6 |
"name": "required_publication_assets_present",
|
|
@@ -182,8 +182,8 @@
|
|
| 182 |
"github_repo": {
|
| 183 |
"root": "repo",
|
| 184 |
"exists": true,
|
| 185 |
-
"file_count":
|
| 186 |
-
"text_file_count":
|
| 187 |
"largest_file": {
|
| 188 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 189 |
"bytes": 55702978
|
|
@@ -193,8 +193,8 @@
|
|
| 193 |
"hf_space_bundle": {
|
| 194 |
"root": "hf_publish/space",
|
| 195 |
"exists": true,
|
| 196 |
-
"file_count":
|
| 197 |
-
"text_file_count":
|
| 198 |
"largest_file": {
|
| 199 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 200 |
"bytes": 55702978
|
|
@@ -204,8 +204,8 @@
|
|
| 204 |
"hf_artifact_bundle": {
|
| 205 |
"root": "hf_publish/artifacts",
|
| 206 |
"exists": true,
|
| 207 |
-
"file_count":
|
| 208 |
-
"text_file_count":
|
| 209 |
"largest_file": {
|
| 210 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 211 |
"bytes": 55702978
|
|
@@ -215,8 +215,8 @@
|
|
| 215 |
"hf_model_bundle": {
|
| 216 |
"root": "hf_publish/model",
|
| 217 |
"exists": true,
|
| 218 |
-
"file_count":
|
| 219 |
-
"text_file_count":
|
| 220 |
"largest_file": {
|
| 221 |
"path": "pytorch_model.bin",
|
| 222 |
"bytes": 93495480
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:02+00:00",
|
| 4 |
"checks": [
|
| 5 |
{
|
| 6 |
"name": "required_publication_assets_present",
|
|
|
|
| 182 |
"github_repo": {
|
| 183 |
"root": "repo",
|
| 184 |
"exists": true,
|
| 185 |
+
"file_count": 450,
|
| 186 |
+
"text_file_count": 380,
|
| 187 |
"largest_file": {
|
| 188 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 189 |
"bytes": 55702978
|
|
|
|
| 193 |
"hf_space_bundle": {
|
| 194 |
"root": "hf_publish/space",
|
| 195 |
"exists": true,
|
| 196 |
+
"file_count": 363,
|
| 197 |
+
"text_file_count": 293,
|
| 198 |
"largest_file": {
|
| 199 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 200 |
"bytes": 55702978
|
|
|
|
| 204 |
"hf_artifact_bundle": {
|
| 205 |
"root": "hf_publish/artifacts",
|
| 206 |
"exists": true,
|
| 207 |
+
"file_count": 522,
|
| 208 |
+
"text_file_count": 428,
|
| 209 |
"largest_file": {
|
| 210 |
"path": "results/episode_task_suite/modality_reconstruction/predictions.npz",
|
| 211 |
"bytes": 55702978
|
|
|
|
| 215 |
"hf_model_bundle": {
|
| 216 |
"root": "hf_publish/model",
|
| 217 |
"exists": true,
|
| 218 |
+
"file_count": 709,
|
| 219 |
+
"text_file_count": 580,
|
| 220 |
"largest_file": {
|
| 221 |
"path": "pytorch_model.bin",
|
| 222 |
"bytes": 93495480
|
metrics/scope_claims_audit.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"summary": {
|
| 5 |
"qwen3_omni_verified_diagnostic_pilot": true,
|
| 6 |
"dataset_manifest_num_episodes": 119,
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:01+00:00",
|
| 4 |
"summary": {
|
| 5 |
"qwen3_omni_verified_diagnostic_pilot": true,
|
| 6 |
"dataset_manifest_num_episodes": 119,
|
metrics/task_surface_integrity.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"summary": {
|
| 5 |
"task_count": 12,
|
| 6 |
"expected_task_count": 12,
|
|
@@ -64,15 +64,21 @@
|
|
| 64 |
"observed": "timeline_action"
|
| 65 |
},
|
| 66 |
{
|
| 67 |
-
"name": "timeline_action:
|
| 68 |
"status": "pass",
|
| 69 |
-
"value": "
|
| 70 |
"raw_hits": []
|
| 71 |
},
|
| 72 |
{
|
| 73 |
-
"name": "timeline_action:
|
| 74 |
"status": "pass",
|
| 75 |
-
"value": "
|
| 76 |
"raw_hits": []
|
| 77 |
},
|
| 78 |
{
|
|
@@ -88,9 +94,9 @@
|
|
| 88 |
"raw_hits": []
|
| 89 |
},
|
| 90 |
{
|
| 91 |
-
"name": "timeline_action:
|
| 92 |
"status": "pass",
|
| 93 |
-
"value": "
|
| 94 |
"raw_hits": []
|
| 95 |
},
|
| 96 |
{
|
|
@@ -99,12 +105,6 @@
|
|
| 99 |
"value": "Look at one short multimodal window and name what action is happening now.",
|
| 100 |
"raw_hits": []
|
| 101 |
},
|
| 102 |
-
{
|
| 103 |
-
"name": "timeline_action: public_field_process_short_is_human_readable",
|
| 104 |
-
"status": "pass",
|
| 105 |
-
"value": "window features -> action label builder -> classifier",
|
| 106 |
-
"raw_hits": []
|
| 107 |
-
},
|
| 108 |
{
|
| 109 |
"name": "timeline_action: known_task_family",
|
| 110 |
"status": "pass",
|
|
@@ -184,15 +184,21 @@
|
|
| 184 |
"observed": "timeline_subtask"
|
| 185 |
},
|
| 186 |
{
|
| 187 |
-
"name": "timeline_subtask:
|
| 188 |
"status": "pass",
|
| 189 |
-
"value": "
|
| 190 |
"raw_hits": []
|
| 191 |
},
|
| 192 |
{
|
| 193 |
-
"name": "timeline_subtask:
|
| 194 |
"status": "pass",
|
| 195 |
-
"value": "
|
| 196 |
"raw_hits": []
|
| 197 |
},
|
| 198 |
{
|
|
@@ -208,9 +214,9 @@
|
|
| 208 |
"raw_hits": []
|
| 209 |
},
|
| 210 |
{
|
| 211 |
-
"name": "timeline_subtask:
|
| 212 |
"status": "pass",
|
| 213 |
-
"value": "
|
| 214 |
"raw_hits": []
|
| 215 |
},
|
| 216 |
{
|
|
@@ -219,12 +225,6 @@
|
|
| 219 |
"value": "Predict the higher-level task stage for the current window.",
|
| 220 |
"raw_hits": []
|
| 221 |
},
|
| 222 |
-
{
|
| 223 |
-
"name": "timeline_subtask: public_field_process_short_is_human_readable",
|
| 224 |
-
"status": "pass",
|
| 225 |
-
"value": "window features -> subtask label builder -> classifier",
|
| 226 |
-
"raw_hits": []
|
| 227 |
-
},
|
| 228 |
{
|
| 229 |
"name": "timeline_subtask: known_task_family",
|
| 230 |
"status": "pass",
|
|
@@ -304,15 +304,21 @@
|
|
| 304 |
"observed": "transition_detection"
|
| 305 |
},
|
| 306 |
{
|
| 307 |
-
"name": "transition_detection:
|
| 308 |
"status": "pass",
|
| 309 |
-
"value": "
|
| 310 |
"raw_hits": []
|
| 311 |
},
|
| 312 |
{
|
| 313 |
-
"name": "transition_detection:
|
| 314 |
"status": "pass",
|
| 315 |
-
"value": "
|
| 316 |
"raw_hits": []
|
| 317 |
},
|
| 318 |
{
|
|
@@ -328,9 +334,9 @@
|
|
| 328 |
"raw_hits": []
|
| 329 |
},
|
| 330 |
{
|
| 331 |
-
"name": "transition_detection:
|
| 332 |
"status": "pass",
|
| 333 |
-
"value": "
|
| 334 |
"raw_hits": []
|
| 335 |
},
|
| 336 |
{
|
|
@@ -339,12 +345,6 @@
|
|
| 339 |
"value": "Detect whether the current window is near a boundary between actions.",
|
| 340 |
"raw_hits": []
|
| 341 |
},
|
| 342 |
-
{
|
| 343 |
-
"name": "transition_detection: public_field_process_short_is_human_readable",
|
| 344 |
-
"status": "pass",
|
| 345 |
-
"value": "action changes -> boundary labels -> binary classifier",
|
| 346 |
-
"raw_hits": []
|
| 347 |
-
},
|
| 348 |
{
|
| 349 |
"name": "transition_detection: known_task_family",
|
| 350 |
"status": "pass",
|
|
@@ -422,15 +422,21 @@
|
|
| 422 |
"observed": "next_action"
|
| 423 |
},
|
| 424 |
{
|
| 425 |
-
"name": "next_action:
|
| 426 |
"status": "pass",
|
| 427 |
-
"value": "
|
| 428 |
"raw_hits": []
|
| 429 |
},
|
| 430 |
{
|
| 431 |
-
"name": "next_action:
|
| 432 |
"status": "pass",
|
| 433 |
-
"value": "
|
| 434 |
"raw_hits": []
|
| 435 |
},
|
| 436 |
{
|
|
@@ -446,9 +452,9 @@
|
|
| 446 |
"raw_hits": []
|
| 447 |
},
|
| 448 |
{
|
| 449 |
-
"name": "next_action:
|
| 450 |
"status": "pass",
|
| 451 |
-
"value": "
|
| 452 |
"raw_hits": []
|
| 453 |
},
|
| 454 |
{
|
|
@@ -457,12 +463,6 @@
|
|
| 457 |
"value": "Use the current window to guess the action that will happen shortly after it.",
|
| 458 |
"raw_hits": []
|
| 459 |
},
|
| 460 |
-
{
|
| 461 |
-
"name": "next_action: public_field_process_short_is_human_readable",
|
| 462 |
-
"status": "pass",
|
| 463 |
-
"value": "current features -> future label shift -> classifier",
|
| 464 |
-
"raw_hits": []
|
| 465 |
-
},
|
| 466 |
{
|
| 467 |
"name": "next_action: known_task_family",
|
| 468 |
"status": "pass",
|
|
@@ -540,15 +540,21 @@
|
|
| 540 |
"observed": "hand_trajectory_forecast"
|
| 541 |
},
|
| 542 |
{
|
| 543 |
-
"name": "hand_trajectory_forecast:
|
| 544 |
"status": "pass",
|
| 545 |
-
"value": "current multimodal
|
| 546 |
"raw_hits": []
|
| 547 |
},
|
| 548 |
{
|
| 549 |
-
"name": "hand_trajectory_forecast:
|
| 550 |
"status": "pass",
|
| 551 |
-
"value": "
|
| 552 |
"raw_hits": []
|
| 553 |
},
|
| 554 |
{
|
|
@@ -564,9 +570,9 @@
|
|
| 564 |
"raw_hits": []
|
| 565 |
},
|
| 566 |
{
|
| 567 |
-
"name": "hand_trajectory_forecast:
|
| 568 |
"status": "pass",
|
| 569 |
-
"value": "
|
| 570 |
"raw_hits": []
|
| 571 |
},
|
| 572 |
{
|
|
@@ -575,12 +581,6 @@
|
|
| 575 |
"value": "Predict where the hands will move over the next few frames.",
|
| 576 |
"raw_hits": []
|
| 577 |
},
|
| 578 |
-
{
|
| 579 |
-
"name": "hand_trajectory_forecast: public_field_process_short_is_human_readable",
|
| 580 |
-
"status": "pass",
|
| 581 |
-
"value": "current features -> future mocap target -> regression head",
|
| 582 |
-
"raw_hits": []
|
| 583 |
-
},
|
| 584 |
{
|
| 585 |
"name": "hand_trajectory_forecast: known_task_family",
|
| 586 |
"status": "pass",
|
|
@@ -658,15 +658,21 @@
|
|
| 658 |
"observed": "contact_prediction"
|
| 659 |
},
|
| 660 |
{
|
| 661 |
-
"name": "contact_prediction:
|
| 662 |
"status": "pass",
|
| 663 |
-
"value": "
|
| 664 |
"raw_hits": []
|
| 665 |
},
|
| 666 |
{
|
| 667 |
-
"name": "contact_prediction:
|
| 668 |
"status": "pass",
|
| 669 |
-
"value": "
|
| 670 |
"raw_hits": []
|
| 671 |
},
|
| 672 |
{
|
|
@@ -682,9 +688,9 @@
|
|
| 682 |
"raw_hits": []
|
| 683 |
},
|
| 684 |
{
|
| 685 |
-
"name": "contact_prediction:
|
| 686 |
"status": "pass",
|
| 687 |
-
"value": "
|
| 688 |
"raw_hits": []
|
| 689 |
},
|
| 690 |
{
|
|
@@ -693,12 +699,6 @@
|
|
| 693 |
"value": "Predict whether the body or hand is in contact with something.",
|
| 694 |
"raw_hits": []
|
| 695 |
},
|
| 696 |
-
{
|
| 697 |
-
"name": "contact_prediction: public_field_process_short_is_human_readable",
|
| 698 |
-
"status": "pass",
|
| 699 |
-
"value": "feature filter -> contact target -> binary classifier",
|
| 700 |
-
"raw_hits": []
|
| 701 |
-
},
|
| 702 |
{
|
| 703 |
"name": "contact_prediction: known_task_family",
|
| 704 |
"status": "pass",
|
|
@@ -774,15 +774,21 @@
|
|
| 774 |
"observed": "object_relevance"
|
| 775 |
},
|
| 776 |
{
|
| 777 |
-
"name": "object_relevance:
|
| 778 |
"status": "pass",
|
| 779 |
-
"value": "non-caption
|
| 780 |
"raw_hits": []
|
| 781 |
},
|
| 782 |
{
|
| 783 |
-
"name": "object_relevance:
|
| 784 |
"status": "pass",
|
| 785 |
-
"value": "
|
| 786 |
"raw_hits": []
|
| 787 |
},
|
| 788 |
{
|
|
@@ -798,9 +804,9 @@
|
|
| 798 |
"raw_hits": []
|
| 799 |
},
|
| 800 |
{
|
| 801 |
-
"name": "object_relevance:
|
| 802 |
"status": "pass",
|
| 803 |
-
"value": "
|
| 804 |
"raw_hits": []
|
| 805 |
},
|
| 806 |
{
|
|
@@ -809,12 +815,6 @@
|
|
| 809 |
"value": "Predict which objects matter in the current window.",
|
| 810 |
"raw_hits": []
|
| 811 |
},
|
| 812 |
-
{
|
| 813 |
-
"name": "object_relevance: public_field_process_short_is_human_readable",
|
| 814 |
-
"status": "pass",
|
| 815 |
-
"value": "object vocabulary -> multi-hot labels -> sigmoid heads",
|
| 816 |
-
"raw_hits": []
|
| 817 |
-
},
|
| 818 |
{
|
| 819 |
"name": "object_relevance: known_task_family",
|
| 820 |
"status": "pass",
|
|
@@ -892,15 +892,21 @@
|
|
| 892 |
"observed": "caption_grounding"
|
| 893 |
},
|
| 894 |
{
|
| 895 |
-
"name": "caption_grounding:
|
| 896 |
"status": "pass",
|
| 897 |
-
"value": "
|
| 898 |
"raw_hits": []
|
| 899 |
},
|
| 900 |
{
|
| 901 |
-
"name": "caption_grounding:
|
| 902 |
"status": "pass",
|
| 903 |
-
"value": "
|
| 904 |
"raw_hits": []
|
| 905 |
},
|
| 906 |
{
|
|
@@ -916,9 +922,9 @@
|
|
| 916 |
"raw_hits": []
|
| 917 |
},
|
| 918 |
{
|
| 919 |
-
"name": "caption_grounding:
|
| 920 |
"status": "pass",
|
| 921 |
-
"value": "
|
| 922 |
"raw_hits": []
|
| 923 |
},
|
| 924 |
{
|
|
@@ -927,12 +933,6 @@
|
|
| 927 |
"value": "Given a text-like query from annotation, find the matching time window.",
|
| 928 |
"raw_hits": []
|
| 929 |
},
|
| 930 |
-
{
|
| 931 |
-
"name": "caption_grounding: public_field_process_short_is_human_readable",
|
| 932 |
-
"status": "pass",
|
| 933 |
-
"value": "query features -> candidate index -> cosine ranker",
|
| 934 |
-
"raw_hits": []
|
| 935 |
-
},
|
| 936 |
{
|
| 937 |
"name": "caption_grounding: known_task_family",
|
| 938 |
"status": "pass",
|
|
@@ -1008,15 +1008,21 @@
|
|
| 1008 |
"observed": "cross_modal_retrieval"
|
| 1009 |
},
|
| 1010 |
{
|
| 1011 |
-
"name": "cross_modal_retrieval:
|
| 1012 |
"status": "pass",
|
| 1013 |
-
"value": "motion
|
| 1014 |
"raw_hits": []
|
| 1015 |
},
|
| 1016 |
{
|
| 1017 |
-
"name": "cross_modal_retrieval:
|
| 1018 |
"status": "pass",
|
| 1019 |
-
"value": "
|
| 1020 |
"raw_hits": []
|
| 1021 |
},
|
| 1022 |
{
|
|
@@ -1032,9 +1038,9 @@
|
|
| 1032 |
"raw_hits": []
|
| 1033 |
},
|
| 1034 |
{
|
| 1035 |
-
"name": "cross_modal_retrieval:
|
| 1036 |
"status": "pass",
|
| 1037 |
-
"value": "
|
| 1038 |
"raw_hits": []
|
| 1039 |
},
|
| 1040 |
{
|
|
@@ -1043,12 +1049,6 @@
|
|
| 1043 |
"value": "Use one group of modalities to retrieve the matching window from another group.",
|
| 1044 |
"raw_hits": []
|
| 1045 |
},
|
| 1046 |
-
{
|
| 1047 |
-
"name": "cross_modal_retrieval: public_field_process_short_is_human_readable",
|
| 1048 |
-
"status": "pass",
|
| 1049 |
-
"value": "modality split -> projection -> nearest-neighbor ranker",
|
| 1050 |
-
"raw_hits": []
|
| 1051 |
-
},
|
| 1052 |
{
|
| 1053 |
"name": "cross_modal_retrieval: known_task_family",
|
| 1054 |
"status": "pass",
|
|
@@ -1126,15 +1126,21 @@
|
|
| 1126 |
"observed": "modality_reconstruction"
|
| 1127 |
},
|
| 1128 |
{
|
| 1129 |
-
"name": "modality_reconstruction:
|
| 1130 |
"status": "pass",
|
| 1131 |
-
"value": "motion, IMU, and camera
|
| 1132 |
"raw_hits": []
|
| 1133 |
},
|
| 1134 |
{
|
| 1135 |
-
"name": "modality_reconstruction:
|
| 1136 |
"status": "pass",
|
| 1137 |
-
"value": "
|
| 1138 |
"raw_hits": []
|
| 1139 |
},
|
| 1140 |
{
|
|
@@ -1150,9 +1156,9 @@
|
|
| 1150 |
"raw_hits": []
|
| 1151 |
},
|
| 1152 |
{
|
| 1153 |
-
"name": "modality_reconstruction:
|
| 1154 |
"status": "pass",
|
| 1155 |
-
"value": "
|
| 1156 |
"raw_hits": []
|
| 1157 |
},
|
| 1158 |
{
|
|
@@ -1161,12 +1167,6 @@
|
|
| 1161 |
"value": "Predict one modality feature block from other modality blocks.",
|
| 1162 |
"raw_hits": []
|
| 1163 |
},
|
| 1164 |
-
{
|
| 1165 |
-
"name": "modality_reconstruction: public_field_process_short_is_human_readable",
|
| 1166 |
-
"status": "pass",
|
| 1167 |
-
"value": "source-target split -> scaler -> regression head",
|
| 1168 |
-
"raw_hits": []
|
| 1169 |
-
},
|
| 1170 |
{
|
| 1171 |
"name": "modality_reconstruction: known_task_family",
|
| 1172 |
"status": "pass",
|
|
@@ -1243,12 +1243,6 @@
|
|
| 1243 |
"status": "pass",
|
| 1244 |
"observed": "temporal_order"
|
| 1245 |
},
|
| 1246 |
-
{
|
| 1247 |
-
"name": "temporal_order: public_field_input_short_is_human_readable",
|
| 1248 |
-
"status": "pass",
|
| 1249 |
-
"value": "two adjacent windows plus difference vector",
|
| 1250 |
-
"raw_hits": []
|
| 1251 |
-
},
|
| 1252 |
{
|
| 1253 |
"name": "temporal_order: public_field_card_blurb_is_human_readable",
|
| 1254 |
"status": "pass",
|
|
@@ -1256,27 +1250,27 @@
|
|
| 1256 |
"raw_hits": []
|
| 1257 |
},
|
| 1258 |
{
|
| 1259 |
-
"name": "temporal_order:
|
| 1260 |
"status": "pass",
|
| 1261 |
"value": "Temporal Order Verification",
|
| 1262 |
"raw_hits": []
|
| 1263 |
},
|
| 1264 |
{
|
| 1265 |
-
"name": "temporal_order:
|
| 1266 |
"status": "pass",
|
| 1267 |
-
"value": "
|
| 1268 |
"raw_hits": []
|
| 1269 |
},
|
| 1270 |
{
|
| 1271 |
-
"name": "temporal_order:
|
| 1272 |
"status": "pass",
|
| 1273 |
"value": "Temporal Order Verification",
|
| 1274 |
"raw_hits": []
|
| 1275 |
},
|
| 1276 |
{
|
| 1277 |
-
"name": "temporal_order:
|
| 1278 |
"status": "pass",
|
| 1279 |
-
"value": "
|
| 1280 |
"raw_hits": []
|
| 1281 |
},
|
| 1282 |
{
|
|
@@ -1285,6 +1279,12 @@
|
|
| 1285 |
"value": "pair builder -> feature combiner -> binary classifier",
|
| 1286 |
"raw_hits": []
|
| 1287 |
},
|
| 1288 |
{
|
| 1289 |
"name": "temporal_order: known_task_family",
|
| 1290 |
"status": "pass",
|
|
@@ -1360,15 +1360,21 @@
|
|
| 1360 |
"observed": "misalignment_detection"
|
| 1361 |
},
|
| 1362 |
{
|
| 1363 |
-
"name": "misalignment_detection:
|
| 1364 |
"status": "pass",
|
| 1365 |
-
"value": "motion
|
| 1366 |
"raw_hits": []
|
| 1367 |
},
|
| 1368 |
{
|
| 1369 |
-
"name": "misalignment_detection:
|
| 1370 |
"status": "pass",
|
| 1371 |
-
"value": "
|
| 1372 |
"raw_hits": []
|
| 1373 |
},
|
| 1374 |
{
|
|
@@ -1384,9 +1390,9 @@
|
|
| 1384 |
"raw_hits": []
|
| 1385 |
},
|
| 1386 |
{
|
| 1387 |
-
"name": "misalignment_detection:
|
| 1388 |
"status": "pass",
|
| 1389 |
-
"value": "
|
| 1390 |
"raw_hits": []
|
| 1391 |
},
|
| 1392 |
{
|
|
@@ -1395,12 +1401,6 @@
|
|
| 1395 |
"value": "Detect when modalities that should match are shifted out of sync.",
|
| 1396 |
"raw_hits": []
|
| 1397 |
},
|
| 1398 |
-
{
|
| 1399 |
-
"name": "misalignment_detection: public_field_process_short_is_human_readable",
|
| 1400 |
-
"status": "pass",
|
| 1401 |
-
"value": "aligned/shifted pairs -> feature combiner -> binary classifier",
|
| 1402 |
-
"raw_hits": []
|
| 1403 |
-
},
|
| 1404 |
{
|
| 1405 |
"name": "misalignment_detection: known_task_family",
|
| 1406 |
"status": "pass",
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:53:59+00:00",
|
| 4 |
"summary": {
|
| 5 |
"task_count": 12,
|
| 6 |
"expected_task_count": 12,
|
|
|
|
| 64 |
"observed": "timeline_action"
|
| 65 |
},
|
| 66 |
{
|
| 67 |
+
"name": "timeline_action: public_field_card_blurb_is_human_readable",
|
| 68 |
"status": "pass",
|
| 69 |
+
"value": "Recognize the current manipulation action from synchronized visual, motion, inertial, pose, and annotation context.",
|
| 70 |
"raw_hits": []
|
| 71 |
},
|
| 72 |
{
|
| 73 |
+
"name": "timeline_action: public_field_research_name_is_human_readable",
|
| 74 |
"status": "pass",
|
| 75 |
+
"value": "Egocentric Action Recognition",
|
| 76 |
+
"raw_hits": []
|
| 77 |
+
},
|
| 78 |
+
{
|
| 79 |
+
"name": "timeline_action: public_field_input_short_is_human_readable",
|
| 80 |
+
"status": "pass",
|
| 81 |
+
"value": "20-frame multimodal window",
|
| 82 |
"raw_hits": []
|
| 83 |
},
|
| 84 |
{
|
|
|
|
| 94 |
"raw_hits": []
|
| 95 |
},
|
| 96 |
{
|
| 97 |
+
"name": "timeline_action: public_field_process_short_is_human_readable",
|
| 98 |
"status": "pass",
|
| 99 |
+
"value": "window features -> action label builder -> classifier",
|
| 100 |
"raw_hits": []
|
| 101 |
},
|
| 102 |
{
|
|
|
|
| 105 |
"value": "Look at one short multimodal window and name what action is happening now.",
|
| 106 |
"raw_hits": []
|
| 107 |
},
|
| 108 |
{
|
| 109 |
"name": "timeline_action: known_task_family",
|
| 110 |
"status": "pass",
|
|
|
|
| 184 |
"observed": "timeline_subtask"
|
| 185 |
},
|
| 186 |
{
|
| 187 |
+
"name": "timeline_subtask: public_field_card_blurb_is_human_readable",
|
| 188 |
"status": "pass",
|
| 189 |
+
"value": "Recognize the broader activity stage so fine actions become a readable procedure timeline.",
|
| 190 |
"raw_hits": []
|
| 191 |
},
|
| 192 |
{
|
| 193 |
+
"name": "timeline_subtask: public_field_research_name_is_human_readable",
|
| 194 |
"status": "pass",
|
| 195 |
+
"value": "Temporal Subtask Recognition",
|
| 196 |
+
"raw_hits": []
|
| 197 |
+
},
|
| 198 |
+
{
|
| 199 |
+
"name": "timeline_subtask: public_field_input_short_is_human_readable",
|
| 200 |
+
"status": "pass",
|
| 201 |
+
"value": "20-frame multimodal window",
|
| 202 |
"raw_hits": []
|
| 203 |
},
|
| 204 |
{
|
|
|
|
| 214 |
"raw_hits": []
|
| 215 |
},
|
| 216 |
{
|
| 217 |
+
"name": "timeline_subtask: public_field_process_short_is_human_readable",
|
| 218 |
"status": "pass",
|
| 219 |
+
"value": "window features -> subtask label builder -> classifier",
|
| 220 |
"raw_hits": []
|
| 221 |
},
|
| 222 |
{
|
|
|
|
| 225 |
"value": "Predict the higher-level task stage for the current window.",
|
| 226 |
"raw_hits": []
|
| 227 |
},
|
| 228 |
{
|
| 229 |
"name": "timeline_subtask: known_task_family",
|
| 230 |
"status": "pass",
|
|
|
|
| 304 |
"observed": "transition_detection"
|
| 305 |
},
|
| 306 |
{
|
| 307 |
+
"name": "transition_detection: public_field_card_blurb_is_human_readable",
|
| 308 |
"status": "pass",
|
| 309 |
+
"value": "Detect the local moment where the episode changes from one action segment to the next.",
|
| 310 |
"raw_hits": []
|
| 311 |
},
|
| 312 |
{
|
| 313 |
+
"name": "transition_detection: public_field_research_name_is_human_readable",
|
| 314 |
"status": "pass",
|
| 315 |
+
"value": "Temporal Action Segmentation",
|
| 316 |
+
"raw_hits": []
|
| 317 |
+
},
|
| 318 |
+
{
|
| 319 |
+
"name": "transition_detection: public_field_input_short_is_human_readable",
|
| 320 |
+
"status": "pass",
|
| 321 |
+
"value": "current window with boundary target",
|
| 322 |
"raw_hits": []
|
| 323 |
},
|
| 324 |
{
|
|
|
|
| 334 |
"raw_hits": []
|
| 335 |
},
|
| 336 |
{
|
| 337 |
+
"name": "transition_detection: public_field_process_short_is_human_readable",
|
| 338 |
"status": "pass",
|
| 339 |
+
"value": "action changes -> boundary labels -> binary classifier",
|
| 340 |
"raw_hits": []
|
| 341 |
},
|
| 342 |
{
|
|
|
|
| 345 |
"value": "Detect whether the current window is near a boundary between actions.",
|
| 346 |
"raw_hits": []
|
| 347 |
},
|
| 348 |
{
|
| 349 |
"name": "transition_detection: known_task_family",
|
| 350 |
"status": "pass",
|
|
|
|
| 422 |
"observed": "next_action"
|
| 423 |
},
|
| 424 |
{
|
| 425 |
+
"name": "next_action: public_field_card_blurb_is_human_readable",
|
| 426 |
"status": "pass",
|
| 427 |
+
"value": "Forecast the near-future action from the current observations only.",
|
| 428 |
"raw_hits": []
|
| 429 |
},
|
| 430 |
{
|
| 431 |
+
"name": "next_action: public_field_research_name_is_human_readable",
|
| 432 |
"status": "pass",
|
| 433 |
+
"value": "Short-Horizon Intention Prediction",
|
| 434 |
+
"raw_hits": []
|
| 435 |
+
},
|
| 436 |
+
{
|
| 437 |
+
"name": "next_action: public_field_input_short_is_human_readable",
|
| 438 |
+
"status": "pass",
|
| 439 |
+
"value": "current window at time t",
|
| 440 |
"raw_hits": []
|
| 441 |
},
|
| 442 |
{
|
|
|
|
| 452 |
"raw_hits": []
|
| 453 |
},
|
| 454 |
{
|
| 455 |
+
"name": "next_action: public_field_process_short_is_human_readable",
|
| 456 |
"status": "pass",
|
| 457 |
+
"value": "current features -> future label shift -> classifier",
|
| 458 |
"raw_hits": []
|
| 459 |
},
|
| 460 |
{
|
|
|
|
| 463 |
"value": "Use the current window to guess the action that will happen shortly after it.",
|
| 464 |
"raw_hits": []
|
| 465 |
},
|
| 466 |
{
|
| 467 |
"name": "next_action: known_task_family",
|
| 468 |
"status": "pass",
|
|
|
|
| 540 |
"observed": "hand_trajectory_forecast"
|
| 541 |
},
|
| 542 |
{
|
| 543 |
+
"name": "hand_trajectory_forecast: public_field_card_blurb_is_human_readable",
|
| 544 |
"status": "pass",
|
| 545 |
+
"value": "Predict the future 3D left/right hand path from the current multimodal state.",
|
| 546 |
"raw_hits": []
|
| 547 |
},
|
| 548 |
{
|
| 549 |
+
"name": "hand_trajectory_forecast: public_field_research_name_is_human_readable",
|
| 550 |
"status": "pass",
|
| 551 |
+
"value": "3D Hand Motion Forecasting",
|
| 552 |
+
"raw_hits": []
|
| 553 |
+
},
|
| 554 |
+
{
|
| 555 |
+
"name": "hand_trajectory_forecast: public_field_input_short_is_human_readable",
|
| 556 |
+
"status": "pass",
|
| 557 |
+
"value": "current multimodal window",
|
| 558 |
"raw_hits": []
|
| 559 |
},
|
| 560 |
{
|
|
|
|
| 570 |
"raw_hits": []
|
| 571 |
},
|
| 572 |
{
|
| 573 |
+
"name": "hand_trajectory_forecast: public_field_process_short_is_human_readable",
|
| 574 |
"status": "pass",
|
| 575 |
+
"value": "current features -> future mocap target -> regression head",
|
| 576 |
"raw_hits": []
|
| 577 |
},
|
| 578 |
{
|
|
|
|
| 581 |
"value": "Predict where the hands will move over the next few frames.",
|
| 582 |
"raw_hits": []
|
| 583 |
},
|
| 584 |
{
|
| 585 |
"name": "hand_trajectory_forecast: known_task_family",
|
| 586 |
"status": "pass",
|
|
|
|
| 658 |
"observed": "contact_prediction"
|
| 659 |
},
|
| 660 |
{
|
| 661 |
+
"name": "contact_prediction: public_field_card_blurb_is_human_readable",
|
| 662 |
"status": "pass",
|
| 663 |
+
"value": "Predict whether body or hand contact with the scene is occurring without leaking contact labels.",
|
| 664 |
"raw_hits": []
|
| 665 |
},
|
| 666 |
{
|
| 667 |
+
"name": "contact_prediction: public_field_research_name_is_human_readable",
|
| 668 |
"status": "pass",
|
| 669 |
+
"value": "Human-Object Contact Prediction",
|
| 670 |
+
"raw_hits": []
|
| 671 |
+
},
|
| 672 |
+
{
|
| 673 |
+
"name": "contact_prediction: public_field_input_short_is_human_readable",
|
| 674 |
+
"status": "pass",
|
| 675 |
+
"value": "non-contact, non-caption features",
|
| 676 |
"raw_hits": []
|
| 677 |
},
|
| 678 |
{
|
|
|
|
| 688 |
"raw_hits": []
|
| 689 |
},
|
| 690 |
{
|
| 691 |
+
"name": "contact_prediction: public_field_process_short_is_human_readable",
|
| 692 |
"status": "pass",
|
| 693 |
+
"value": "feature filter -> contact target -> binary classifier",
|
| 694 |
"raw_hits": []
|
| 695 |
},
|
| 696 |
{
|
|
|
|
| 699 |
"value": "Predict whether the body or hand is in contact with something.",
|
| 700 |
"raw_hits": []
|
| 701 |
},
|
| 702 |
{
|
| 703 |
"name": "contact_prediction: known_task_family",
|
| 704 |
"status": "pass",
|
|
|
|
| 774 |
"observed": "object_relevance"
|
| 775 |
},
|
| 776 |
{
|
| 777 |
+
"name": "object_relevance: public_field_card_blurb_is_human_readable",
|
| 778 |
"status": "pass",
|
| 779 |
+
"value": "Infer which objects are relevant to the current manipulation window from non-caption features.",
|
| 780 |
"raw_hits": []
|
| 781 |
},
|
| 782 |
{
|
| 783 |
+
"name": "object_relevance: public_field_research_name_is_human_readable",
|
| 784 |
"status": "pass",
|
| 785 |
+
"value": "Object-Centric Interaction Recognition",
|
| 786 |
+
"raw_hits": []
|
| 787 |
+
},
|
| 788 |
+
{
|
| 789 |
+
"name": "object_relevance: public_field_input_short_is_human_readable",
|
| 790 |
+
"status": "pass",
|
| 791 |
+
"value": "non-caption multimodal features",
|
| 792 |
"raw_hits": []
|
| 793 |
},
|
| 794 |
{
|
|
|
|
| 804 |
"raw_hits": []
|
| 805 |
},
|
| 806 |
{
|
| 807 |
+
"name": "object_relevance: public_field_process_short_is_human_readable",
|
| 808 |
"status": "pass",
|
| 809 |
+
"value": "object vocabulary -> multi-hot labels -> sigmoid heads",
|
| 810 |
"raw_hits": []
|
| 811 |
},
|
| 812 |
{
|
|
|
|
| 815 |
"value": "Predict which objects matter in the current window.",
|
| 816 |
"raw_hits": []
|
| 817 |
},
|
| 818 |
{
|
| 819 |
"name": "object_relevance: known_task_family",
|
| 820 |
"status": "pass",
|
|
|
|
| 892 |
"observed": "caption_grounding"
|
| 893 |
},
|
| 894 |
{
|
| 895 |
+
"name": "caption_grounding: public_field_card_blurb_is_human_readable",
|
| 896 |
"status": "pass",
|
| 897 |
+
"value": "Retrieve the matching time window for an annotation-derived text query.",
|
| 898 |
"raw_hits": []
|
| 899 |
},
|
| 900 |
{
|
| 901 |
+
"name": "caption_grounding: public_field_research_name_is_human_readable",
|
| 902 |
"status": "pass",
|
| 903 |
+
"value": "Language-to-Moment Grounding",
|
| 904 |
+
"raw_hits": []
|
| 905 |
+
},
|
| 906 |
+
{
|
| 907 |
+
"name": "caption_grounding: public_field_input_short_is_human_readable",
|
| 908 |
+
"status": "pass",
|
| 909 |
+
"value": "text-like query and candidate windows",
|
| 910 |
"raw_hits": []
|
| 911 |
},
|
| 912 |
{
|
|
|
|
| 922 |
"raw_hits": []
|
| 923 |
},
|
| 924 |
{
|
| 925 |
+
"name": "caption_grounding: public_field_process_short_is_human_readable",
|
| 926 |
"status": "pass",
|
| 927 |
+
"value": "query features -> candidate index -> cosine ranker",
|
| 928 |
"raw_hits": []
|
| 929 |
},
|
| 930 |
{
|
|
|
|
| 933 |
"value": "Given a text-like query from annotation, find the matching time window.",
|
| 934 |
"raw_hits": []
|
| 935 |
},
|
| 936 |
{
|
| 937 |
"name": "caption_grounding: known_task_family",
|
| 938 |
"status": "pass",
|
|
|
|
| 1008 |
"observed": "cross_modal_retrieval"
|
| 1009 |
},
|
| 1010 |
{
|
| 1011 |
+
"name": "cross_modal_retrieval: public_field_card_blurb_is_human_readable",
|
| 1012 |
"status": "pass",
|
| 1013 |
+
"value": "Use motion, IMU, and camera-pose signals to retrieve the matching depth/video window.",
|
| 1014 |
"raw_hits": []
|
| 1015 |
},
|
| 1016 |
{
|
| 1017 |
+
"name": "cross_modal_retrieval: public_field_research_name_is_human_readable",
|
| 1018 |
"status": "pass",
|
| 1019 |
+
"value": "Multimodal Representation Retrieval",
|
| 1020 |
+
"raw_hits": []
|
| 1021 |
+
},
|
| 1022 |
+
{
|
| 1023 |
+
"name": "cross_modal_retrieval: public_field_input_short_is_human_readable",
|
| 1024 |
+
"status": "pass",
|
| 1025 |
+
"value": "motion/IMU/pose query; depth/video candidates",
|
| 1026 |
"raw_hits": []
|
| 1027 |
},
|
| 1028 |
{
|
|
|
|
| 1038 |
"raw_hits": []
|
| 1039 |
},
|
| 1040 |
{
|
| 1041 |
+
"name": "cross_modal_retrieval: public_field_process_short_is_human_readable",
|
| 1042 |
"status": "pass",
|
| 1043 |
+
"value": "modality split -> projection -> nearest-neighbor ranker",
|
| 1044 |
"raw_hits": []
|
| 1045 |
},
|
| 1046 |
{
|
|
|
|
| 1049 |
"value": "Use one group of modalities to retrieve the matching window from another group.",
|
| 1050 |
"raw_hits": []
|
| 1051 |
},
|
| 1052 |
{
|
| 1053 |
"name": "cross_modal_retrieval: known_task_family",
|
| 1054 |
"status": "pass",
|
|
|
|
| 1126 |
"observed": "modality_reconstruction"
|
| 1127 |
},
|
| 1128 |
{
|
| 1129 |
+
"name": "modality_reconstruction: public_field_card_blurb_is_human_readable",
|
| 1130 |
"status": "pass",
|
| 1131 |
+
"value": "Predict compressed depth/video feature vectors from motion, IMU, and camera-pose features.",
|
| 1132 |
"raw_hits": []
|
| 1133 |
},
|
| 1134 |
{
|
| 1135 |
+
"name": "modality_reconstruction: public_field_research_name_is_human_readable",
|
| 1136 |
"status": "pass",
|
| 1137 |
+
"value": "Modality Feature Reconstruction",
|
| 1138 |
+
"raw_hits": []
|
| 1139 |
+
},
|
| 1140 |
+
{
|
| 1141 |
+
"name": "modality_reconstruction: public_field_input_short_is_human_readable",
|
| 1142 |
+
"status": "pass",
|
| 1143 |
+
"value": "motion, IMU, and camera/pose features",
|
| 1144 |
"raw_hits": []
|
| 1145 |
},
|
| 1146 |
{
|
|
|
|
| 1156 |
"raw_hits": []
|
| 1157 |
},
|
| 1158 |
{
|
| 1159 |
+
"name": "modality_reconstruction: public_field_process_short_is_human_readable",
|
| 1160 |
"status": "pass",
|
| 1161 |
+
"value": "source-target split -> scaler -> regression head",
|
| 1162 |
"raw_hits": []
|
| 1163 |
},
|
| 1164 |
{
|
|
|
|
| 1167 |
"value": "Predict one modality feature block from other modality blocks.",
|
| 1168 |
"raw_hits": []
|
| 1169 |
},
|
| 1170 |
{
|
| 1171 |
"name": "modality_reconstruction: known_task_family",
|
| 1172 |
"status": "pass",
|
|
|
|
| 1243 |
"status": "pass",
|
| 1244 |
"observed": "temporal_order"
|
| 1245 |
},
|
| 1246 |
{
|
| 1247 |
"name": "temporal_order: public_field_card_blurb_is_human_readable",
|
| 1248 |
"status": "pass",
|
|
|
|
| 1250 |
"raw_hits": []
|
| 1251 |
},
|
| 1252 |
{
|
| 1253 |
+
"name": "temporal_order: public_field_research_name_is_human_readable",
|
| 1254 |
"status": "pass",
|
| 1255 |
"value": "Temporal Order Verification",
|
| 1256 |
"raw_hits": []
|
| 1257 |
},
|
| 1258 |
{
|
| 1259 |
+
"name": "temporal_order: public_field_input_short_is_human_readable",
|
| 1260 |
"status": "pass",
|
| 1261 |
+
"value": "two adjacent windows plus difference vector",
|
| 1262 |
"raw_hits": []
|
| 1263 |
},
|
| 1264 |
{
|
| 1265 |
+
"name": "temporal_order: public_field_display_name_is_human_readable",
|
| 1266 |
"status": "pass",
|
| 1267 |
"value": "Temporal Order Verification",
|
| 1268 |
"raw_hits": []
|
| 1269 |
},
|
| 1270 |
{
|
| 1271 |
+
"name": "temporal_order: public_field_output_short_is_human_readable",
|
| 1272 |
"status": "pass",
|
| 1273 |
+
"value": "correct or reversed",
|
| 1274 |
"raw_hits": []
|
| 1275 |
},
|
| 1276 |
{
|
|
|
|
| 1279 |
"value": "pair builder -> feature combiner -> binary classifier",
|
| 1280 |
"raw_hits": []
|
| 1281 |
},
|
| 1282 |
+
{
|
| 1283 |
+
"name": "temporal_order: public_field_plain_goal_is_human_readable",
|
| 1284 |
+
"status": "pass",
|
| 1285 |
+
"value": "Tell whether two nearby windows are in the correct time order.",
|
| 1286 |
+
"raw_hits": []
|
| 1287 |
+
},
|
| 1288 |
{
|
| 1289 |
"name": "temporal_order: known_task_family",
|
| 1290 |
"status": "pass",
|
|
|
|
| 1360 |
"observed": "misalignment_detection"
|
| 1361 |
},
|
| 1362 |
{
|
| 1363 |
+
"name": "misalignment_detection: public_field_card_blurb_is_human_readable",
|
| 1364 |
"status": "pass",
|
| 1365 |
+
"value": "Detect whether motion and visual/depth streams have been artificially shifted out of sync.",
|
| 1366 |
"raw_hits": []
|
| 1367 |
},
|
| 1368 |
{
|
| 1369 |
+
"name": "misalignment_detection: public_field_research_name_is_human_readable",
|
| 1370 |
"status": "pass",
|
| 1371 |
+
"value": "Cross-Modal Misalignment Detection",
|
| 1372 |
+
"raw_hits": []
|
| 1373 |
+
},
|
| 1374 |
+
{
|
| 1375 |
+
"name": "misalignment_detection: public_field_input_short_is_human_readable",
|
| 1376 |
+
"status": "pass",
|
| 1377 |
+
"value": "motion-side and visual/depth-side feature groups",
|
| 1378 |
"raw_hits": []
|
| 1379 |
},
|
| 1380 |
{
|
|
|
|
| 1390 |
"raw_hits": []
|
| 1391 |
},
|
| 1392 |
{
|
| 1393 |
+
"name": "misalignment_detection: public_field_process_short_is_human_readable",
|
| 1394 |
"status": "pass",
|
| 1395 |
+
"value": "aligned/shifted pairs -> feature combiner -> binary classifier",
|
| 1396 |
"raw_hits": []
|
| 1397 |
},
|
| 1398 |
{
|
|
|
|
| 1401 |
"value": "Detect when modalities that should match are shifted out of sync.",
|
| 1402 |
"raw_hits": []
|
| 1403 |
},
|
| 1404 |
{
|
| 1405 |
"name": "misalignment_detection: known_task_family",
|
| 1406 |
"status": "pass",
|
metrics/website_integrity.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
-
"generated_at_utc": "2026-06-06T14:
|
| 4 |
"docs_root": "docs",
|
| 5 |
"site_base": "/ropedia-xperience-10m-task-suite/",
|
| 6 |
"summary": {
|
|
@@ -251,7 +251,7 @@
|
|
| 251 |
},
|
| 252 |
{
|
| 253 |
"path": "data/artifact_index.json",
|
| 254 |
-
"bytes":
|
| 255 |
"top_level_type": "dict"
|
| 256 |
},
|
| 257 |
{
|
|
@@ -291,7 +291,7 @@
|
|
| 291 |
},
|
| 292 |
{
|
| 293 |
"path": "data/mirror_parity.json",
|
| 294 |
-
"bytes":
|
| 295 |
"top_level_type": "dict"
|
| 296 |
},
|
| 297 |
{
|
|
@@ -301,7 +301,7 @@
|
|
| 301 |
},
|
| 302 |
{
|
| 303 |
"path": "data/omni_finetune_verified_result.json",
|
| 304 |
-
"bytes":
|
| 305 |
"top_level_type": "dict"
|
| 306 |
},
|
| 307 |
{
|
|
@@ -321,7 +321,7 @@
|
|
| 321 |
},
|
| 322 |
{
|
| 323 |
"path": "data/project_status.json",
|
| 324 |
-
"bytes":
|
| 325 |
"top_level_type": "dict"
|
| 326 |
},
|
| 327 |
{
|
|
|
|
| 1 |
{
|
| 2 |
"status": "pass",
|
| 3 |
+
"generated_at_utc": "2026-06-06T14:54:01+00:00",
|
| 4 |
"docs_root": "docs",
|
| 5 |
"site_base": "/ropedia-xperience-10m-task-suite/",
|
| 6 |
"summary": {
|
|
|
|
| 251 |
},
|
| 252 |
{
|
| 253 |
"path": "data/artifact_index.json",
|
| 254 |
+
"bytes": 39486,
|
| 255 |
"top_level_type": "dict"
|
| 256 |
},
|
| 257 |
{
|
|
|
|
| 291 |
},
|
| 292 |
{
|
| 293 |
"path": "data/mirror_parity.json",
|
| 294 |
+
"bytes": 126335,
|
| 295 |
"top_level_type": "dict"
|
| 296 |
},
|
| 297 |
{
|
|
|
|
| 301 |
},
|
| 302 |
{
|
| 303 |
"path": "data/omni_finetune_verified_result.json",
|
| 304 |
+
"bytes": 4142,
|
| 305 |
"top_level_type": "dict"
|
| 306 |
},
|
| 307 |
{
|
|
|
|
| 321 |
},
|
| 322 |
{
|
| 323 |
"path": "data/project_status.json",
|
| 324 |
+
"bytes": 11274,
|
| 325 |
"top_level_type": "dict"
|
| 326 |
},
|
| 327 |
{
|
results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/PUBLIC_RESULT_SUMMARY.md
CHANGED
|
@@ -22,4 +22,22 @@
|
|
| 22 |
|
| 23 |
Raw Xperience-10M files, base-model weights, adapter or checkpoint weights, full checkpoints, and large archives are not included.
|
| 24 |
|
| 25 |
Use this package as the source for README, website, and Hugging Face updates.
|
|
|
|
| 22 |
|
| 23 |
Raw Xperience-10M files, base-model weights, adapter or checkpoint weights, full checkpoints, and large archives are not included.
|
| 24 |
|
| 25 |
+
## Error Analysis
|
| 26 |
+
|
| 27 |
+
The package includes a derived held-out error analysis under `analysis/`. It
|
| 28 |
+
groups the 448 public prediction rows by episode, coarse action family,
|
| 29 |
+
train-seen status, required-modality state, and object category.
|
| 30 |
+
|
| 31 |
+
Key readouts:
|
| 32 |
+
|
| 33 |
+
- Official JSON validity from `metrics.json`: `0.8750`
|
| 34 |
+
- Parsed prediction rate from public rows: `0.8772`
|
| 35 |
+
- Weakest action family by parsed prediction rate: `locomotion` (23 rows, rate `0.2609`)
|
| 36 |
+
- Train-seen split: rows with labels seen in training reach an action exact rate of `0.0458`; rows with unseen labels reach `0.0158`
|
| 37 |
+
- Required-modality state: all held-out rows have required modalities present, with only `visualization.rrd` absent
|
| 38 |
+
|
| 39 |
+
Use `analysis/ERROR_ANALYSIS.md` and
|
| 40 |
+
`analysis/error_analysis_summary.json` before planning the next
|
| 41 |
+
structured-output pass.
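The two validity readouts above differ by one row when converted to counts over the 448 prediction rows. A minimal sketch of that arithmetic follows; the variable names are illustrative and not part of the package, and the package does not state why the two counts differ.

```python
# Row counts implied by the two validity readouts (448 public prediction rows).
TOTAL_ROWS = 448
JSON_VALIDITY = 0.875               # JSON validity reported from metrics.json
PARSED_RATE = 0.8772321428571429    # overall parsed_prediction_rate from the summary JSON

# Convert each rate back to a whole number of rows.
json_valid_rows = round(JSON_VALIDITY * TOTAL_ROWS)
parsed_rows = round(PARSED_RATE * TOTAL_ROWS)
row_gap = parsed_rows - json_valid_rows

print(json_valid_rows, parsed_rows, row_gap)  # prints: 392 393 1
```

Working in row counts rather than rates makes it easier to see that the two figures describe nearly the same set of rows.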
|
| 42 |
+
|
| 43 |
Use this package as the source for README, website, and Hugging Face updates.
|
results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md
ADDED
|
@@ -0,0 +1,78 @@
|
|
| 1 |
+
# Qwen3-Omni Held-Out Error Analysis
|
| 2 |
+
|
| 3 |
+
This report is computed from the verified public package predictions. It contains only derived metrics and sanitized examples.
|
| 4 |
+
|
| 5 |
+
## Overall
|
| 6 |
+
|
| 7 |
+
- Prediction rows: `448`
|
| 8 |
+
- JSON validity from `metrics.json`: `0.8750`
|
| 9 |
+
- Parsed prediction rate from public rows: `0.8772`
|
| 10 |
+
- Action exact rate: `0.0246`
|
| 11 |
+
- Subtask exact rate: `0.0067`
|
| 12 |
+
- Contact exact rate: `0.6451`
|
| 13 |
+
- Object F1: `0.2230`
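The Object F1 above is consistent with the harmonic mean of the overall `object_precision` and `object_recall` values in `error_analysis_summary.json`. A minimal check, assuming those two values are pooled over all rows; the helper name `f1_from_pr` is illustrative and not part of the package:

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Overall values copied from error_analysis_summary.json.
overall_precision = 0.19611111111111112
overall_recall = 0.25841874084919475

overall_f1 = f1_from_pr(overall_precision, overall_recall)
print(round(overall_f1, 4))
```

The zero guard matters for groups such as the `cleaning` action family, where precision and recall are both `0.0`.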
|
| 14 |
+
|
| 15 |
+
## Weakest Episode Groups
|
| 16 |
+
|
| 17 |
+
| group | samples | parsed_prediction_rate | action_exact_rate | object_f1 |
|
| 18 |
+
| --- | --- | --- | --- | --- |
|
| 19 |
+
| 1796b943-caad-43c6-b9bd-80b8d601f37d__ep1 | 32 | 0.5625 | 0.0000 | 0.0459 |
|
| 20 |
+
| 8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1 | 32 | 0.7500 | 0.0312 | 0.0942 |
|
| 21 |
+
| 33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1 | 32 | 0.8438 | 0.0000 | 0.0529 |
|
| 22 |
+
| b750fab3-7fbb-43a0-b451-c64c4d4a64da__ep1 | 32 | 0.8438 | 0.0000 | 0.2353 |
|
| 23 |
+
| ba18b7c1-21ff-45da-8452-41acce7fc8de__ep2 | 32 | 0.8438 | 0.0000 | 0.2836 |
|
| 24 |
+
| ba045ed4-ef25-404d-b756-8dcbd45b18fa__ep2 | 32 | 0.8438 | 0.0625 | 0.0746 |
|
| 25 |
+
| b9dd769b-e31a-4fdb-945e-5a60db6487b0__ep2 | 32 | 0.8750 | 0.0312 | 0.3265 |
|
| 26 |
+
| 4b02bb38-384a-438a-b5f9-6131d85c34b0__ep1 | 32 | 0.8750 | 0.0938 | 0.2830 |
|
| 27 |
+
|
| 28 |
+
## Action Families
|
| 29 |
+
|
| 30 |
+
| group | samples | parsed_prediction_rate | action_exact_rate | subtask_exact_rate | object_f1 |
|
| 31 |
+
| --- | --- | --- | --- | --- | --- |
|
| 32 |
+
| locomotion | 23 | 0.2609 | 0.0000 | 0.0000 | 0.0120 |
|
| 33 |
+
| food_kitchen | 5 | 0.6000 | 0.2000 | 0.0000 | 0.2727 |
|
| 34 |
+
| cleaning | 8 | 0.7500 | 0.0000 | 0.0000 | 0.0000 |
|
| 35 |
+
| other | 94 | 0.8511 | 0.0000 | 0.0000 | 0.1910 |
|
| 36 |
+
| phone_use | 51 | 0.9020 | 0.0588 | 0.0196 | 0.3501 |
|
| 37 |
+
| paper_cardboard_craft | 142 | 0.9225 | 0.0282 | 0.0141 | 0.2308 |
|
| 38 |
+
| small_object_sorting | 87 | 0.9655 | 0.0000 | 0.0000 | 0.2740 |
|
| 39 |
+
| retail_stocking | 38 | 0.9737 | 0.0789 | 0.0000 | 0.1564 |
|
| 40 |
+
|
| 41 |
+
## Train-Seen Split
|
| 42 |
+
|
| 43 |
+
| group | samples | parsed_prediction_rate | action_exact_rate | next_action_exact_rate |
|
| 44 |
+
| --- | --- | --- | --- | --- |
|
| 45 |
+
| unseen_in_train | 317 | 0.8454 | 0.0158 | 0.0158 |
|
| 46 |
+
| seen_in_train | 131 | 0.9542 | 0.0458 | 0.0458 |
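Weighting each group's action exact rate by its sample count recovers the overall action exact rate of `0.0246`. A minimal sketch using the rounded values from the table above; the dictionary layout and names are illustrative assumptions:

```python
# (samples, action_exact_rate) per train-seen group, rounded as in the table.
groups = {
    "unseen_in_train": (317, 0.0158),
    "seen_in_train": (131, 0.0458),
}

total_samples = sum(samples for samples, _ in groups.values())

# Sample-weighted mean of the per-group rates.
weighted_rate = sum(samples * rate for samples, rate in groups.values()) / total_samples

print(total_samples, round(weighted_rate, 4))
```

The rates correspond to roughly 5 and 6 exact matches respectively, so a handful of rows can move either figure noticeably.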
|
| 47 |
+
|
| 48 |
+
## Required-Modality State
|
| 49 |
+
|
| 50 |
+
| group | samples | parsed_prediction_rate | action_exact_rate | object_f1 |
|
| 51 |
+
| --- | --- | --- | --- | --- |
|
| 52 |
+
| rrd_missing_only_required_modalities_present | 448 | 0.8772 | 0.0246 | 0.2230 |
|
| 53 |
+
|
| 54 |
+
## Object Categories
|
| 55 |
+
|
| 56 |
+
| group | samples | object_precision | object_recall | object_f1 |
|
| 57 |
+
| --- | --- | --- | --- | --- |
|
| 58 |
+
| furniture_room | 96 | 0.2534 | 0.2334 | 0.2430 |
|
| 59 |
+
| other_object | 135 | 0.1372 | 0.1643 | 0.1495 |
|
| 60 |
+
| food_kitchen | 56 | 0.2228 | 0.2000 | 0.2108 |
|
| 61 |
+
| cleaning | 8 | 0.0400 | 0.0476 | 0.0435 |
|
| 62 |
+
| phone_device | 162 | 0.3252 | 0.3132 | 0.3191 |
|
| 63 |
+
| paper_cardboard | 261 | 0.2227 | 0.3234 | 0.2638 |
|
| 64 |
+
| craft_small_object | 106 | 0.2266 | 0.2581 | 0.2413 |
|
| 65 |
+
| retail_container | 101 | 0.2028 | 0.1752 | 0.1880 |
|
| 66 |
+
|
| 67 |
+
## Interpretation
|
| 68 |
+
|
| 69 |
+
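The diagnostic pilot is dominated by invalid or weak structured outputs and exact-label failures: overall action and subtask exact rates both stay below `0.03`. These tables identify where to tighten JSON constraints, action/subtask target formatting, and object vocabularies before claiming stronger model quality. Missing-modality robustness is not measured here, because every held-out row has all required modalities present.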
|
| 70 |
+
|
| 71 |
+
Generated files:
|
| 72 |
+
|
| 73 |
+
- `error_analysis_summary.json`
|
| 74 |
+
- `episode_error_analysis.csv`
|
| 75 |
+
- `action_family_error_analysis.csv`
|
| 76 |
+
- `train_seen_error_analysis.csv`
|
| 77 |
+
- `missing_modality_error_analysis.csv`
|
| 78 |
+
- `object_category_error_analysis.csv`
|
results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv
ADDED
|
@@ -0,0 +1,9 @@
|
|
| 1 |
+
group,samples,parsed_prediction_rate,action_exact_rate,subtask_exact_rate,transition_exact_rate,next_action_exact_rate,contact_exact_rate,object_precision,object_recall,object_f1
|
| 2 |
+
locomotion,23,0.2608695652173913,0.0,0.0,0.2608695652173913,0.0,0.08695652173913043,0.010752688172043012,0.0136986301369863,0.012048192771084338
|
| 3 |
+
food_kitchen,5,0.6,0.2,0.0,0.6,0.2,0.2,0.375,0.21428571428571427,0.2727272727272727
|
| 4 |
+
cleaning,8,0.75,0.0,0.0,0.625,0.0,0.625,0.0,0.0,0.0
|
| 5 |
+
other,94,0.851063829787234,0.0,0.0,0.8085106382978723,0.0,0.6063829787234043,0.17220543806646527,0.21428571428571427,0.19095477386934673
|
| 6 |
+
phone_use,51,0.9019607843137255,0.058823529411764705,0.0196078431372549,0.8431372549019608,0.058823529411764705,0.5686274509803921,0.35542168674698793,0.34502923976608185,0.3501483679525222
|
| 7 |
+
paper_cardboard_craft,142,0.9225352112676056,0.028169014084507043,0.014084507042253521,0.9154929577464789,0.028169014084507043,0.8169014084507042,0.1853233830845771,0.3059548254620123,0.2308288148721921
|
| 8 |
+
small_object_sorting,87,0.9655172413793104,0.0,0.0,0.9425287356321839,0.0,0.5747126436781609,0.26515151515151514,0.2834008097165992,0.27397260273972607
|
| 9 |
+
retail_stocking,38,0.9736842105263158,0.07894736842105263,0.0,0.9473684210526315,0.07894736842105263,0.7631578947368421,0.15384615384615385,0.1590909090909091,0.1564245810055866
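The per-family rows above can be read back into counts to see which groups are too small to interpret on their own. A minimal sketch with the standard `csv` module, using the first four columns of the file above inlined as a string; the helper name `load_groups` and the 10-row threshold are illustrative assumptions:

```python
import csv
import io

# First four columns of action_family_error_analysis.csv, copied from above.
CSV_EXCERPT = """group,samples,parsed_prediction_rate,action_exact_rate
locomotion,23,0.2608695652173913,0.0
food_kitchen,5,0.6,0.2
cleaning,8,0.75,0.0
other,94,0.851063829787234,0.0
phone_use,51,0.9019607843137255,0.058823529411764705
paper_cardboard_craft,142,0.9225352112676056,0.028169014084507043
small_object_sorting,87,0.9655172413793104,0.0
retail_stocking,38,0.9736842105263158,0.07894736842105263
"""


def load_groups(text):
    """Parse the CSV text into dicts with numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        samples = int(row["samples"])
        rate = float(row["parsed_prediction_rate"])
        rows.append({
            "group": row["group"],
            "samples": samples,
            "parsed_rate": rate,
            # Recover the whole-row count behind the rate.
            "parsed_rows": round(rate * samples),
        })
    return rows


rows = load_groups(CSV_EXCERPT)
weakest = min(rows, key=lambda r: r["parsed_rate"])["group"]
total_samples = sum(r["samples"] for r in rows)
total_parsed = sum(r["parsed_rows"] for r in rows)
small_groups = [r["group"] for r in rows if r["samples"] < 10]

print(weakest, total_samples, total_parsed, small_groups)
```

Groups flagged as small (here `food_kitchen` and `cleaning`) have rates that rest on a few rows, so they are better treated as pointers for inspection than as stable estimates.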
|
results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv
ADDED
|
@@ -0,0 +1,15 @@
|
|
| 1 |
+
group,samples,parsed_prediction_rate,action_exact_rate,subtask_exact_rate,transition_exact_rate,next_action_exact_rate,contact_exact_rate,object_precision,object_recall,object_f1
|
| 2 |
+
1796b943-caad-43c6-b9bd-80b8d601f37d__ep1,32,0.5625,0.0,0.0,0.5625,0.0,0.53125,0.045871559633027525,0.045871559633027525,0.045871559633027525
|
| 3 |
+
8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1,32,0.75,0.03125,0.0,0.71875,0.03125,0.4375,0.08108108108108109,0.1125,0.09424083769633508
|
| 4 |
+
33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1,32,0.84375,0.0,0.0,0.6875,0.0,0.53125,0.043859649122807015,0.06666666666666667,0.05291005291005291
|
| 5 |
+
b750fab3-7fbb-43a0-b451-c64c4d4a64da__ep1,32,0.84375,0.0,0.0,0.84375,0.0,0.375,0.2153846153846154,0.25925925925925924,0.23529411764705882
|
| 6 |
+
ba18b7c1-21ff-45da-8452-41acce7fc8de__ep2,32,0.84375,0.0,0.0,0.84375,0.0,0.75,0.3,0.2689655172413793,0.2836363636363637
|
| 7 |
+
ba045ed4-ef25-404d-b756-8dcbd45b18fa__ep2,32,0.84375,0.0625,0.0625,0.84375,0.0625,0.75,0.04830917874396135,0.16393442622950818,0.07462686567164178
|
| 8 |
+
b9dd769b-e31a-4fdb-945e-5a60db6487b0__ep2,32,0.875,0.03125,0.0,0.8125,0.03125,0.40625,0.30303030303030304,0.35398230088495575,0.32653061224489793
|
| 9 |
+
4b02bb38-384a-438a-b5f9-6131d85c34b0__ep1,32,0.875,0.09375,0.03125,0.8125,0.09375,0.40625,0.2608695652173913,0.30927835051546393,0.2830188679245283
|
| 10 |
+
9c553886-83c5-4dc4-be5c-dcb269b3a771__ep2,32,0.9375,0.0,0.0,0.9375,0.0,0.9375,0.21333333333333335,0.2831858407079646,0.24334600760456274
|
| 11 |
+
5399ef86-4df9-49bc-809f-8f4f92f9e659__ep6,32,0.9375,0.0,0.0,0.90625,0.0,0.78125,0.027777777777777776,0.027777777777777776,0.027777777777777776
|
| 12 |
+
b6579cb5-0a71-4ca6-8808-1e2700be05c7__ep3,32,0.96875,0.03125,0.0,0.9375,0.03125,0.96875,0.5130434782608696,0.4573643410852713,0.48360655737704916
|
| 13 |
+
a1012a57-385e-45a9-8a59-694a26fe92a5__ep1,32,1.0,0.0,0.0,1.0,0.0,0.90625,0.1927710843373494,0.48484848484848486,0.27586206896551724
|
| 14 |
+
877779cd-25f3-4293-a3c4-39067dd9558c__ep4,32,1.0,0.0,0.0,1.0,0.0,0.34375,0.3402061855670103,0.3548387096774194,0.3473684210526316
|
| 15 |
+
34f07a04-eb37-45a3-95ec-189ed5f4a85b__ep5,32,1.0,0.09375,0.0,1.0,0.09375,0.90625,0.18840579710144928,0.18055555555555555,0.1843971631205674
|
results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json
ADDED
|
@@ -0,0 +1,667 @@
|
|
| 1 |
+
{
|
| 2 |
+
"status": "pass",
|
| 3 |
+
"source_package": "xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval",
|
| 4 |
+
"source_prediction_rows": 448,
|
| 5 |
+
"metrics_json_validity_rate": 0.875,
|
| 6 |
+
"computed": {
|
| 7 |
+
"group": "overall",
|
| 8 |
+
"samples": 448,
|
| 9 |
+
"parsed_prediction_rate": 0.8772321428571429,
|
| 10 |
+
"action_exact_rate": 0.024553571428571428,
|
| 11 |
+
"subtask_exact_rate": 0.006696428571428571,
|
| 12 |
+
"transition_exact_rate": 0.8504464285714286,
|
| 13 |
+
"next_action_exact_rate": 0.024553571428571428,
|
| 14 |
+
"contact_exact_rate": 0.6450892857142857,
|
| 15 |
+
"object_precision": 0.19611111111111112,
|
| 16 |
+
"object_recall": 0.25841874084919475,
|
| 17 |
+
"object_f1": 0.22299431459254582
|
| 18 |
+
},
|
| 19 |
+
"worst_episode_groups": [
|
| 20 |
+
{
|
| 21 |
+
"group": "1796b943-caad-43c6-b9bd-80b8d601f37d__ep1",
|
| 22 |
+
"samples": 32,
|
| 23 |
+
"parsed_prediction_rate": 0.5625,
|
| 24 |
+
"action_exact_rate": 0.0,
|
| 25 |
+
"subtask_exact_rate": 0.0,
|
| 26 |
+
"transition_exact_rate": 0.5625,
|
| 27 |
+
"next_action_exact_rate": 0.0,
|
| 28 |
+
"contact_exact_rate": 0.53125,
|
| 29 |
+
"object_precision": 0.045871559633027525,
|
| 30 |
+
"object_recall": 0.045871559633027525,
|
| 31 |
+
"object_f1": 0.045871559633027525
|
| 32 |
+
},
|
| 33 |
+
{
|
| 34 |
+
"group": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 35 |
+
"samples": 32,
|
| 36 |
+
"parsed_prediction_rate": 0.75,
|
| 37 |
+
"action_exact_rate": 0.03125,
|
| 38 |
+
"subtask_exact_rate": 0.0,
|
| 39 |
+
"transition_exact_rate": 0.71875,
|
| 40 |
+
"next_action_exact_rate": 0.03125,
|
| 41 |
+
"contact_exact_rate": 0.4375,
|
| 42 |
+
"object_precision": 0.08108108108108109,
|
| 43 |
+
"object_recall": 0.1125,
|
| 44 |
+
"object_f1": 0.09424083769633508
|
| 45 |
+
},
|
| 46 |
+
{
|
| 47 |
+
"group": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1",
|
| 48 |
+
"samples": 32,
|
| 49 |
+
"parsed_prediction_rate": 0.84375,
|
| 50 |
+
"action_exact_rate": 0.0,
|
| 51 |
+
"subtask_exact_rate": 0.0,
|
| 52 |
+
"transition_exact_rate": 0.6875,
|
| 53 |
+
"next_action_exact_rate": 0.0,
|
| 54 |
+
"contact_exact_rate": 0.53125,
|
| 55 |
+
"object_precision": 0.043859649122807015,
|
| 56 |
+
"object_recall": 0.06666666666666667,
|
| 57 |
+
"object_f1": 0.05291005291005291
|
| 58 |
+
},
|
| 59 |
+
{
|
| 60 |
+
"group": "b750fab3-7fbb-43a0-b451-c64c4d4a64da__ep1",
|
| 61 |
+
"samples": 32,
|
| 62 |
+
"parsed_prediction_rate": 0.84375,
|
| 63 |
+
"action_exact_rate": 0.0,
|
| 64 |
+
"subtask_exact_rate": 0.0,
|
| 65 |
+
"transition_exact_rate": 0.84375,
|
| 66 |
+
"next_action_exact_rate": 0.0,
|
| 67 |
+
"contact_exact_rate": 0.375,
|
| 68 |
+
"object_precision": 0.2153846153846154,
|
| 69 |
+
"object_recall": 0.25925925925925924,
|
| 70 |
+
"object_f1": 0.23529411764705882
|
| 71 |
+
},
|
| 72 |
+
{
|
| 73 |
+
"group": "ba18b7c1-21ff-45da-8452-41acce7fc8de__ep2",
|
| 74 |
+
"samples": 32,
|
| 75 |
+
"parsed_prediction_rate": 0.84375,
|
| 76 |
+
"action_exact_rate": 0.0,
|
| 77 |
+
"subtask_exact_rate": 0.0,
|
| 78 |
+
"transition_exact_rate": 0.84375,
|
| 79 |
+
"next_action_exact_rate": 0.0,
|
| 80 |
+
"contact_exact_rate": 0.75,
|
| 81 |
+
"object_precision": 0.3,
|
| 82 |
+
"object_recall": 0.2689655172413793,
|
| 83 |
+
"object_f1": 0.2836363636363637
|
| 84 |
+
},
|
| 85 |
+
{
|
| 86 |
+
"group": "ba045ed4-ef25-404d-b756-8dcbd45b18fa__ep2",
|
| 87 |
+
"samples": 32,
|
| 88 |
+
"parsed_prediction_rate": 0.84375,
|
| 89 |
+
"action_exact_rate": 0.0625,
|
| 90 |
+
"subtask_exact_rate": 0.0625,
|
| 91 |
+
"transition_exact_rate": 0.84375,
|
| 92 |
+
"next_action_exact_rate": 0.0625,
|
| 93 |
+
"contact_exact_rate": 0.75,
|
| 94 |
+
"object_precision": 0.04830917874396135,
|
| 95 |
+
"object_recall": 0.16393442622950818,
|
| 96 |
+
"object_f1": 0.07462686567164178
|
| 97 |
+
},
|
| 98 |
+
{
|
| 99 |
+
"group": "b9dd769b-e31a-4fdb-945e-5a60db6487b0__ep2",
|
| 100 |
+
"samples": 32,
|
| 101 |
+
"parsed_prediction_rate": 0.875,
|
| 102 |
+
"action_exact_rate": 0.03125,
|
| 103 |
+
"subtask_exact_rate": 0.0,
|
| 104 |
+
"transition_exact_rate": 0.8125,
|
| 105 |
+
"next_action_exact_rate": 0.03125,
|
| 106 |
+
"contact_exact_rate": 0.40625,
|
| 107 |
+
"object_precision": 0.30303030303030304,
|
| 108 |
+
"object_recall": 0.35398230088495575,
|
| 109 |
+
"object_f1": 0.32653061224489793
|
| 110 |
+
},
|
| 111 |
+
{
|
| 112 |
+
"group": "4b02bb38-384a-438a-b5f9-6131d85c34b0__ep1",
|
| 113 |
+
"samples": 32,
|
| 114 |
+
"parsed_prediction_rate": 0.875,
|
| 115 |
+
"action_exact_rate": 0.09375,
|
| 116 |
+
"subtask_exact_rate": 0.03125,
|
| 117 |
+
"transition_exact_rate": 0.8125,
|
| 118 |
+
"next_action_exact_rate": 0.09375,
|
| 119 |
+
"contact_exact_rate": 0.40625,
|
| 120 |
+
"object_precision": 0.2608695652173913,
|
| 121 |
+
"object_recall": 0.30927835051546393,
|
| 122 |
+
"object_f1": 0.2830188679245283
|
| 123 |
+
}
|
| 124 |
+
],
|
| 125 |
+
"action_family_groups": [
|
| 126 |
+
{
|
| 127 |
+
"group": "locomotion",
|
| 128 |
+
"samples": 23,
|
| 129 |
+
"parsed_prediction_rate": 0.2608695652173913,
|
| 130 |
+
"action_exact_rate": 0.0,
|
| 131 |
+
"subtask_exact_rate": 0.0,
|
| 132 |
+
"transition_exact_rate": 0.2608695652173913,
|
| 133 |
+
"next_action_exact_rate": 0.0,
|
| 134 |
+
"contact_exact_rate": 0.08695652173913043,
|
| 135 |
+
"object_precision": 0.010752688172043012,
|
| 136 |
+
"object_recall": 0.0136986301369863,
|
| 137 |
+
"object_f1": 0.012048192771084338
|
| 138 |
+
},
|
| 139 |
+
{
|
| 140 |
+
"group": "food_kitchen",
|
| 141 |
+
"samples": 5,
|
| 142 |
+
"parsed_prediction_rate": 0.6,
|
| 143 |
+
"action_exact_rate": 0.2,
|
| 144 |
+
"subtask_exact_rate": 0.0,
|
| 145 |
+
"transition_exact_rate": 0.6,
|
| 146 |
+
"next_action_exact_rate": 0.2,
|
| 147 |
+
"contact_exact_rate": 0.2,
|
| 148 |
+
"object_precision": 0.375,
|
| 149 |
+
"object_recall": 0.21428571428571427,
|
| 150 |
+
"object_f1": 0.2727272727272727
|
| 151 |
+
},
|
| 152 |
+
{
|
| 153 |
+
"group": "cleaning",
|
| 154 |
+
"samples": 8,
|
| 155 |
+
"parsed_prediction_rate": 0.75,
|
| 156 |
+
"action_exact_rate": 0.0,
|
| 157 |
+
"subtask_exact_rate": 0.0,
|
| 158 |
+
"transition_exact_rate": 0.625,
|
| 159 |
+
"next_action_exact_rate": 0.0,
|
| 160 |
+
"contact_exact_rate": 0.625,
|
| 161 |
+
"object_precision": 0.0,
|
| 162 |
+
"object_recall": 0.0,
|
| 163 |
+
"object_f1": 0.0
|
| 164 |
+
},
|
| 165 |
+
{
|
| 166 |
+
"group": "other",
|
| 167 |
+
"samples": 94,
|
| 168 |
+
"parsed_prediction_rate": 0.851063829787234,
|
| 169 |
+
"action_exact_rate": 0.0,
|
| 170 |
+
"subtask_exact_rate": 0.0,
|
| 171 |
+
"transition_exact_rate": 0.8085106382978723,
|
| 172 |
+
"next_action_exact_rate": 0.0,
|
| 173 |
+
"contact_exact_rate": 0.6063829787234043,
|
| 174 |
+
"object_precision": 0.17220543806646527,
|
| 175 |
+
"object_recall": 0.21428571428571427,
|
| 176 |
+
"object_f1": 0.19095477386934673
|
| 177 |
+
},
|
| 178 |
+
{
|
| 179 |
+
"group": "phone_use",
|
| 180 |
+
"samples": 51,
|
| 181 |
+
"parsed_prediction_rate": 0.9019607843137255,
|
| 182 |
+
"action_exact_rate": 0.058823529411764705,
|
| 183 |
+
"subtask_exact_rate": 0.0196078431372549,
|
| 184 |
+
"transition_exact_rate": 0.8431372549019608,
|
| 185 |
+
"next_action_exact_rate": 0.058823529411764705,
|
| 186 |
+
"contact_exact_rate": 0.5686274509803921,
|
| 187 |
+
"object_precision": 0.35542168674698793,
|
| 188 |
+
"object_recall": 0.34502923976608185,
|
| 189 |
+
"object_f1": 0.3501483679525222
|
| 190 |
+
},
|
| 191 |
+
{
|
| 192 |
+
"group": "paper_cardboard_craft",
|
| 193 |
+
"samples": 142,
|
| 194 |
+
"parsed_prediction_rate": 0.9225352112676056,
|
| 195 |
+
"action_exact_rate": 0.028169014084507043,
|
| 196 |
+
"subtask_exact_rate": 0.014084507042253521,
|
| 197 |
+
"transition_exact_rate": 0.9154929577464789,
|
| 198 |
+
"next_action_exact_rate": 0.028169014084507043,
|
| 199 |
+
"contact_exact_rate": 0.8169014084507042,
|
| 200 |
+
"object_precision": 0.1853233830845771,
|
| 201 |
+
"object_recall": 0.3059548254620123,
|
| 202 |
+
"object_f1": 0.2308288148721921
|
| 203 |
+
},
|
| 204 |
+
{
|
| 205 |
+
"group": "small_object_sorting",
|
| 206 |
+
"samples": 87,
|
| 207 |
+
"parsed_prediction_rate": 0.9655172413793104,
|
| 208 |
+
"action_exact_rate": 0.0,
|
| 209 |
+
"subtask_exact_rate": 0.0,
|
| 210 |
+
"transition_exact_rate": 0.9425287356321839,
|
| 211 |
+
"next_action_exact_rate": 0.0,
|
| 212 |
+
"contact_exact_rate": 0.5747126436781609,
|
| 213 |
+
"object_precision": 0.26515151515151514,
|
| 214 |
+
"object_recall": 0.2834008097165992,
|
| 215 |
+
"object_f1": 0.27397260273972607
|
| 216 |
+
},
|
| 217 |
+
{
|
| 218 |
+
"group": "retail_stocking",
|
| 219 |
+
"samples": 38,
|
| 220 |
+
"parsed_prediction_rate": 0.9736842105263158,
|
| 221 |
+
"action_exact_rate": 0.07894736842105263,
|
| 222 |
+
"subtask_exact_rate": 0.0,
|
| 223 |
+
"transition_exact_rate": 0.9473684210526315,
|
| 224 |
+
"next_action_exact_rate": 0.07894736842105263,
|
| 225 |
+
"contact_exact_rate": 0.7631578947368421,
|
| 226 |
+
"object_precision": 0.15384615384615385,
|
| 227 |
+
"object_recall": 0.1590909090909091,
|
| 228 |
+
"object_f1": 0.1564245810055866
|
| 229 |
+
}
|
| 230 |
+
],
|
| 231 |
+
"train_seen_groups": [
|
| 232 |
+
{
|
| 233 |
+
"group": "unseen_in_train",
|
| 234 |
+
"samples": 317,
|
| 235 |
+
"parsed_prediction_rate": 0.8454258675078864,
|
| 236 |
+
"action_exact_rate": 0.015772870662460567,
|
| 237 |
+
"subtask_exact_rate": 0.006309148264984227,
|
| 238 |
+
"transition_exact_rate": 0.8233438485804416,
|
| 239 |
+
"next_action_exact_rate": 0.015772870662460567,
|
| 240 |
+
"contact_exact_rate": 0.6151419558359621,
|
| 241 |
+
"object_precision": 0.15804806991988346,
|
| 242 |
+
"object_recall": 0.23183760683760685,
|
| 243 |
+
"object_f1": 0.18796015591165008
|
| 244 |
+
},
|
| 245 |
+
{
|
| 246 |
+
"group": "seen_in_train",
|
| 247 |
+
"samples": 131,
|
| 248 |
+
"parsed_prediction_rate": 0.9541984732824428,
|
| 249 |
+
"action_exact_rate": 0.04580152671755725,
|
| 250 |
+
"subtask_exact_rate": 0.007633587786259542,
|
| 251 |
+
"transition_exact_rate": 0.916030534351145,
|
| 252 |
+
"next_action_exact_rate": 0.04580152671755725,
|
| 253 |
+
"contact_exact_rate": 0.7175572519083969,
|
| 254 |
+
"object_precision": 0.3185011709601874,
|
| 255 |
+
"object_recall": 0.31627906976744186,
|
| 256 |
+
"object_f1": 0.3173862310385064
|
| 257 |
+
}
|
| 258 |
+
],
|
| 259 |
+
"missing_modality_groups": [
|
| 260 |
+
{
|
| 261 |
+
"group": "rrd_missing_only_required_modalities_present",
|
| 262 |
+
"samples": 448,
|
| 263 |
+
"parsed_prediction_rate": 0.8772321428571429,
|
| 264 |
+
"action_exact_rate": 0.024553571428571428,
|
| 265 |
+
"subtask_exact_rate": 0.006696428571428571,
|
| 266 |
+
"transition_exact_rate": 0.8504464285714286,
|
| 267 |
+
"next_action_exact_rate": 0.024553571428571428,
|
| 268 |
+
"contact_exact_rate": 0.6450892857142857,
|
| 269 |
+
"object_precision": 0.19611111111111112,
|
| 270 |
+
"object_recall": 0.25841874084919475,
|
| 271 |
+
"object_f1": 0.22299431459254582
|
| 272 |
+
}
|
| 273 |
+
],
|
| 274 |
+
"object_category_groups": [
|
| 275 |
+
{
|
| 276 |
+
"group": "furniture_room",
|
| 277 |
+
"samples": 96,
|
| 278 |
+
"parsed_prediction_rate": 0.71875,
|
| 279 |
+
"action_exact_rate": 0.0,
|
| 280 |
+
"subtask_exact_rate": 0.0,
|
| 281 |
+
"transition_exact_rate": 0.7083333333333334,
|
| 282 |
+
"next_action_exact_rate": 0.0,
|
| 283 |
+
"contact_exact_rate": 0.4166666666666667,
|
| 284 |
+
"object_precision": 0.2534246575342466,
|
| 285 |
+
"object_recall": 0.2334384858044164,
|
| 286 |
+
"object_f1": 0.24302134646962234
|
| 287 |
+
},
|
| 288 |
+
{
|
| 289 |
+
"group": "other_object",
|
| 290 |
+
"samples": 135,
|
| 291 |
+
"parsed_prediction_rate": 0.7925925925925926,
|
| 292 |
+
"action_exact_rate": 0.02962962962962963,
|
| 293 |
+
"subtask_exact_rate": 0.007407407407407408,
|
| 294 |
+
"transition_exact_rate": 0.762962962962963,
|
| 295 |
+
"next_action_exact_rate": 0.02962962962962963,
|
| 296 |
+
"contact_exact_rate": 0.6,
|
| 297 |
+
"object_precision": 0.13717693836978131,
|
| 298 |
+
"object_recall": 0.16428571428571428,
|
| 299 |
+
"object_f1": 0.1495124593716143
|
| 300 |
+
},
|
| 301 |
+
{
|
| 302 |
+
"group": "food_kitchen",
|
| 303 |
+
"samples": 56,
|
| 304 |
+
"parsed_prediction_rate": 0.8571428571428571,
|
| 305 |
+
"action_exact_rate": 0.0,
|
| 306 |
+
"subtask_exact_rate": 0.0,
|
| 307 |
+
"transition_exact_rate": 0.8214285714285714,
|
| 308 |
+
"next_action_exact_rate": 0.0,
|
| 309 |
+
"contact_exact_rate": 0.7678571428571429,
|
| 310 |
+
"object_precision": 0.22277227722772278,
|
| 311 |
+
"object_recall": 0.2,
|
| 312 |
+
"object_f1": 0.2107728337236534
|
| 313 |
+
},
|
| 314 |
+
{
|
| 315 |
+
"group": "cleaning",
|
| 316 |
+
"samples": 8,
|
| 317 |
+
"parsed_prediction_rate": 0.875,
|
| 318 |
+
"action_exact_rate": 0.0,
|
| 319 |
+
"subtask_exact_rate": 0.0,
|
| 320 |
+
"transition_exact_rate": 0.875,
|
| 321 |
+
"next_action_exact_rate": 0.0,
|
| 322 |
+
"contact_exact_rate": 0.625,
|
| 323 |
+
"object_precision": 0.04,
|
| 324 |
+
"object_recall": 0.047619047619047616,
|
| 325 |
+
"object_f1": 0.043478260869565216
|
| 326 |
+
},
|
| 327 |
+
{
|
| 328 |
+
"group": "phone_device",
|
| 329 |
+
"samples": 162,
|
| 330 |
+
"parsed_prediction_rate": 0.9074074074074074,
|
| 331 |
+
"action_exact_rate": 0.024691358024691357,
|
| 332 |
+
"subtask_exact_rate": 0.006172839506172839,
|
| 333 |
+
"transition_exact_rate": 0.8703703703703703,
|
| 334 |
+
"next_action_exact_rate": 0.024691358024691357,
|
| 335 |
+
"contact_exact_rate": 0.5864197530864198,
|
| 336 |
+
"object_precision": 0.32521739130434785,
|
| 337 |
+
"object_recall": 0.3132328308207705,
|
| 338 |
+
"object_f1": 0.31911262798634815
|
| 339 |
+
},
|
| 340 |
+
{
|
| 341 |
+
"group": "paper_cardboard",
|
| 342 |
+
"samples": 261,
|
| 343 |
+
"parsed_prediction_rate": 0.9080459770114943,
|
| 344 |
+
"action_exact_rate": 0.034482758620689655,
|
| 345 |
+
"subtask_exact_rate": 0.011494252873563218,
|
| 346 |
+
"transition_exact_rate": 0.8888888888888888,
|
| 347 |
+
"next_action_exact_rate": 0.034482758620689655,
|
| 348 |
+
"contact_exact_rate": 0.7203065134099617,
|
| 349 |
+
"object_precision": 0.22274881516587677,
|
| 350 |
+
"object_recall": 0.32339449541284404,
|
| 351 |
+
"object_f1": 0.2637979420018709
|
| 352 |
+
},
|
| 353 |
+
{
|
| 354 |
+
"group": "craft_small_object",
|
| 355 |
+
"samples": 106,
|
| 356 |
+
"parsed_prediction_rate": 0.9339622641509434,
|
| 357 |
+
"action_exact_rate": 0.02830188679245283,
|
| 358 |
+
"subtask_exact_rate": 0.009433962264150943,
|
| 359 |
+
"transition_exact_rate": 0.9150943396226415,
|
| 360 |
+
"next_action_exact_rate": 0.02830188679245283,
|
| 361 |
+
"contact_exact_rate": 0.5,
|
| 362 |
+
"object_precision": 0.22662889518413598,
|
| 363 |
+
"object_recall": 0.25806451612903225,
|
| 364 |
+
"object_f1": 0.24132730015082954
|
| 365 |
+
},
|
| 366 |
+
{
|
| 367 |
+
"group": "retail_container",
|
| 368 |
+
"samples": 101,
|
| 369 |
+
"parsed_prediction_rate": 0.9405940594059405,
|
| 370 |
+
"action_exact_rate": 0.0297029702970297,
|
| 371 |
+
"subtask_exact_rate": 0.0,
|
| 372 |
+
"transition_exact_rate": 0.9108910891089109,
|
| 373 |
+
"next_action_exact_rate": 0.0297029702970297,
|
| 374 |
+
"contact_exact_rate": 0.7722772277227723,
|
| 375 |
+
"object_precision": 0.20279720279720279,
|
| 376 |
+
"object_recall": 0.17522658610271905,
|
| 377 |
+
"object_f1": 0.18800648298217182
|
| 378 |
+
},
|
| 379 |
+
{
|
| 380 |
+
"group": "tool_stationery",
|
| 381 |
+
"samples": 138,
|
| 382 |
+
"parsed_prediction_rate": 0.9565217391304348,
|
| 383 |
+
"action_exact_rate": 0.014492753623188406,
|
| 384 |
+
"subtask_exact_rate": 0.0,
|
| 385 |
+
"transition_exact_rate": 0.9347826086956522,
|
| 386 |
+
"next_action_exact_rate": 0.014492753623188406,
|
| 387 |
+
"contact_exact_rate": 0.8043478260869565,
|
| 388 |
+
"object_precision": 0.27906976744186046,
|
| 389 |
+
"object_recall": 0.3894523326572008,
|
| 390 |
+
"object_f1": 0.32514817950889074
|
| 391 |
+
},
|
| 392 |
+
{
|
| 393 |
+
"group": "no_object_label",
|
| 394 |
+
"samples": 2,
|
| 395 |
+
"parsed_prediction_rate": 1.0,
|
| 396 |
+
"action_exact_rate": 0.0,
|
| 397 |
+
"subtask_exact_rate": 0.0,
|
| 398 |
+
"transition_exact_rate": 1.0,
|
| 399 |
+
"next_action_exact_rate": 0.0,
|
| 400 |
+
"contact_exact_rate": 1.0,
|
| 401 |
+
"object_precision": 0.0,
|
| 402 |
+
"object_recall": 0.0,
|
| 403 |
+
"object_f1": 0.0
|
| 404 |
+
}
|
| 405 |
+
],
|
| 406 |
+
"invalid_json_examples": [
|
| 407 |
+
{
|
| 408 |
+
"id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1:qa:0",
|
| 409 |
+
"episode_id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 410 |
+
"true_action": "Hold smartphone",
|
| 411 |
+
"raw_prediction_prefix": "{\"action\": \"Pour liquid into bowl\", \"contact\": \"yes\", \"evidence_window\": {\"end_frame\": 19, \"start_frame\": 0}, \"next_action\": \"Pour liquid into bowl\", \"objects\": [\"bottle\", \"bowl\", \"kitchen counter\", \"kitchen sink\", \"kitchen tap\", \"kitchen c"
|
| 412 |
+
},
|
| 413 |
+
{
|
| 414 |
+
"id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1:qa:90",
|
| 415 |
+
"episode_id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 416 |
+
"true_action": "Move towards the stove",
|
| 417 |
+
"raw_prediction_prefix": "{\"action\": \"Walk through workspace\", \"contact\": \"no\", \"evidence_window\": {\"end_frame\": 1819, \"start_frame\": 1800}, \"next_action\": \"Walk through workspace\", \"objects\": [\"kitchen counter\", \"kitchen sink\", \"kitchen cabinets\", \"kitchen applianc"
|
| 418 |
+
},
|
| 419 |
+
{
|
| 420 |
+
"id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1:qa:100",
|
| 421 |
+
"episode_id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 422 |
+
"true_action": "Open stove pot lid",
|
| 423 |
+
"raw_prediction_prefix": "{\"action\": \"Approach packing area\", \"contact\": \"no\", \"evidence_window\": {\"end_frame\": 2019, \"start_frame\": 2000}, \"next_action\": \"Approach packing area\", \"objects\": [\"kitchen counter\", \"kitchen sink\", \"kitchen cabinet\", \"kitchen utensils\", "
|
| 424 |
+
},
|
| 425 |
+
{
|
| 426 |
+
"id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1:qa:126",
|
| 427 |
+
"episode_id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 428 |
+
"true_action": "Closing the door",
|
| 429 |
+
"raw_prediction_prefix": "{\"action\": \"Close door\", \"contact\": \"yes\", \"evidence_window\": {\"end_frame\": 2539, \"start_frame\": 2520}, \"next_action\": \"Close door\", \"objects\": [\"door\", \"door handle\", \"plastic bag\", \"blue bucket\", \"red bucket\", \"blue container\", \"red conta"
|
| 430 |
+
},
|
| 431 |
+
{
|
| 432 |
+
"id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1:qa:127",
|
| 433 |
+
"episode_id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 434 |
+
"true_action": "Closing the door",
|
| 435 |
+
"raw_prediction_prefix": "{\"action\": \"Close the door\", \"contact\": \"yes\", \"evidence_window\": {\"end_frame\": 2559, \"start_frame\": 2540}, \"next_action\": \"Close the door\", \"objects\": [\"door\", \"door handle\", \"door knob\", \"door frame\", \"door hinge\", \"door latch\", \"door loc"
|
| 436 |
+
},
|
| 437 |
+
{
|
| 438 |
+
"id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1:qa:134",
|
| 439 |
+
"episode_id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 440 |
+
"true_action": "Picking up bottle",
|
| 441 |
+
"raw_prediction_prefix": "{\"action\": \"Carry cereal boxes\", \"contact\": \"yes\", \"evidence_window\": {\"end_frame\": 2699, \"start_frame\": 2680}, \"next_action\": \"Carry cereal boxes\", \"objects\": [\"cereal boxes\", \"door\", \"washing machine\", \"laundry basket\", \"laundry detergent"
|
| 442 |
+
},
|
| 443 |
+
{
|
| 444 |
+
"id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1:qa:140",
|
| 445 |
+
"episode_id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 446 |
+
"true_action": "Wipe kitchen counter",
|
| 447 |
+
"raw_prediction_prefix": "{\"action\": \"Walk through workspace\", \"contact\": \"no\", \"evidence_window\": {\"end_frame\": 2819, \"start_frame\": 2800}, \"next_action\": \"Walk through workspace\", \"objects\": [\"laundry machine\", \"laundry basket\", \"laundry detergent\", \"laundry soap\""
|
| 448 |
+
},
|
| 449 |
+
{
|
| 450 |
+
"id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1:qa:170",
|
| 451 |
+
"episode_id": "8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1",
|
| 452 |
+
"true_action": "Reach for cleaning supplies",
|
| 453 |
+
"raw_prediction_prefix": "{\"action\": \"Close cabinet door\", \"contact\": \"yes\", \"evidence_window\": {\"end_frame\": 3419, \"start_frame\": 3400}, \"next_action\": \"Close cabinet door\", \"objects\": [\"cabinet\", \"cabinet door\", \"cabinet handle\", \"cabinet shelf\", \"cabinet drawer\","
|
| 454 |
+
},
|
| 455 |
+
{
|
| 456 |
+
"id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1:qa:0",
|
| 457 |
+
"episode_id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1",
|
| 458 |
+
"true_action": "Observe and walk through store",
|
| 459 |
+
"raw_prediction_prefix": "{\"action\": \"Walk through workspace\", \"contact\": \"no\", \"evidence_window\": {\"end_frame\": 19, \"start_frame\": 0}, \"next_action\": \"Walk through workspace\", \"objects\": [\"person\", \"mirror\", \"desk\", \"chair\", \"box\", \"shelf\", \"door\", \"wall\", \"floor\","
|
| 460 |
+
},
|
| 461 |
+
{
|
| 462 |
+
"id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1:qa:1",
|
| 463 |
+
"episode_id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1",
|
| 464 |
+
"true_action": "Observe and walk through store",
|
| 465 |
+
"raw_prediction_prefix": "{\"action\": \"Walk through workspace\", \"contact\": \"no\", \"evidence_window\": {\"end_frame\": 39, \"start_frame\": 20}, \"next_action\": \"Walk through workspace\", \"objects\": [\"person\", \"desk\", \"chair\", \"wall\", \"ceiling\", \"floor\", \"box\", \"cardboard\", \""
|
| 466 |
+
},
|
| 467 |
+
{
|
| 468 |
+
"id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1:qa:50",
|
| 469 |
+
"episode_id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1",
|
| 470 |
+
"true_action": "Walk towards shelves",
|
| 471 |
+
"raw_prediction_prefix": "{\"action\": \"Walk through workspace\", \"contact\": \"no\", \"evidence_window\": {\"end_frame\": 1019, \"start_frame\": 1000}, \"next_action\": \"Walk through workspace\", \"objects\": [\"person\", \"cardboard\", \"shelf\", \"door\", \"box\", \"jar\", \"lantern\", \"light\""
|
| 472 |
+
},
|
| 473 |
+
{
|
| 474 |
+
"id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1:qa:59",
|
| 475 |
+
"episode_id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1",
|
| 476 |
+
"true_action": "Observe workspace",
|
| 477 |
+
"raw_prediction_prefix": "{\"action\": \"Walk through workspace\", \"contact\": \"no\", \"evidence_window\": {\"end_frame\": 1199, \"start_frame\": 1180}, \"next_action\": \"Walk through workspace\", \"objects\": [\"cardboard\", \"cardboard box\", \"cardboard pieces\", \"cardboard sheet\", \"ca"
|
| 478 |
+
}
|
| 479 |
+
],
|
| 480 |
+
"object_overgeneration_examples": [
|
| 481 |
+
{
|
| 482 |
+
"id": "a1012a57-385e-45a9-8a59-694a26fe92a5__ep1:qa:19",
|
| 483 |
+
"episode_id": "a1012a57-385e-45a9-8a59-694a26fe92a5__ep1",
|
| 484 |
+
"true_action": "Start cutting",
|
| 485 |
+
"predicted_object_count": 175,
|
| 486 |
+
"first_predicted_objects": [
|
| 487 |
+
"cardboard",
|
| 488 |
+
"cardboard box",
|
| 489 |
+
"cardboard pieces",
|
| 490 |
+
"cardboard sheet",
|
| 491 |
+
"cardboard square",
|
| 492 |
+
"cardboard tray",
|
| 493 |
+
"cardboard tube",
|
| 494 |
+
"utility knife",
|
| 495 |
+
"scissors",
|
| 496 |
+
"ruler",
|
| 497 |
+
"pen",
|
| 498 |
+
"marker",
|
| 499 |
+
"box",
|
| 500 |
+
"container",
|
| 501 |
+
"plastic container",
|
| 502 |
+
"tin can",
|
| 503 |
+
"jar",
|
| 504 |
+
"canned food",
|
| 505 |
+
"canned goods",
|
| 506 |
+
"canned product"
|
| 507 |
+
]
|
| 508 |
+
},
|
| 509 |
+
{
|
| 510 |
+
"id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1:qa:70",
|
| 511 |
+
"episode_id": "33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1",
|
| 512 |
+
"true_action": "Reach for wire hangers",
|
| 513 |
+
"predicted_object_count": 53,
|
| 514 |
+
"first_predicted_objects": [
|
| 515 |
+
"cardboard",
|
| 516 |
+
"cardboard box",
|
| 517 |
+
"cardboard pieces",
|
| 518 |
+
"cardboard shapes",
|
| 519 |
+
"cardboard squares",
|
| 520 |
+
"cardboard tray",
|
| 521 |
+
"cardboard tube",
|
| 522 |
+
"cardboard pieces",
|
| 523 |
+
"cardboard shapes",
|
| 524 |
+
"cardboard squares",
|
| 525 |
+
"cardboard tray",
|
| 526 |
+
"cardboard tube",
|
| 527 |
+
"blue foam pieces",
|
| 528 |
+
"blue foam sheet",
|
| 529 |
+
"blue product box",
|
| 530 |
+
"blue strip",
|
| 531 |
+
"canned food",
|
| 532 |
+
"canned goods",
|
| 533 |
+
"canned items",
|
| 534 |
+
"cans"
|
| 535 |
+
]
|
| 536 |
+
},
|
| 537 |
+
{
|
| 538 |
+
"id": "ba045ed4-ef25-404d-b756-8dcbd45b18fa__ep2:qa:30",
|
| 539 |
+
"episode_id": "ba045ed4-ef25-404d-b756-8dcbd45b18fa__ep2",
|
| 540 |
+
"true_action": "Grasp lantern",
|
| 541 |
+
"predicted_object_count": 119,
|
| 542 |
+
"first_predicted_objects": [
|
| 543 |
+
"jar",
|
| 544 |
+
"red bowl",
|
| 545 |
+
"cardboard box",
|
| 546 |
+
"white paper",
|
| 547 |
+
"black bag",
|
| 548 |
+
"white bag",
|
| 549 |
+
"plastic bag",
|
| 550 |
+
"cardboard pieces",
|
| 551 |
+
"cardboard tray",
|
| 552 |
+
"cardboard sheet",
|
| 553 |
+
"cardboard shape",
|
| 554 |
+
"cardboard tube",
|
| 555 |
+
"cardboard strip",
|
| 556 |
+
"cardboard pattern",
|
| 557 |
+
"cardboard cutout",
|
| 558 |
+
"cardboard square",
|
| 559 |
+
"cardboard stack",
|
| 560 |
+
"plastic container",
|
| 561 |
+
"canned food",
|
| 562 |
+
"tin can"
|
| 563 |
+
]
|
| 564 |
+
},
|
| 565 |
+
{
|
| 566 |
+
"id": "ba045ed4-ef25-404d-b756-8dcbd45b18fa__ep2:qa:176",
|
| 567 |
+
"episode_id": "ba045ed4-ef25-404d-b756-8dcbd45b18fa__ep2",
|
| 568 |
+
"true_action": "Release lantern",
|
| 569 |
+
"predicted_object_count": 205,
|
| 570 |
+
"first_predicted_objects": [
|
| 571 |
+
"jar",
|
| 572 |
+
"gift box",
|
| 573 |
+
"cardboard",
|
| 574 |
+
"paper lantern",
|
| 575 |
+
"plastic bag",
|
| 576 |
+
"plastic container",
|
| 577 |
+
"shopping bag",
|
| 578 |
+
"cardboard box",
|
| 579 |
+
"cardboard piece",
|
| 580 |
+
"cardboard tray",
|
| 581 |
+
"cardboard sheet",
|
| 582 |
+
"cardboard shape",
|
| 583 |
+
"cardboard pattern",
|
| 584 |
+
"cardboard square",
|
| 585 |
+
"cardboard strip",
|
| 586 |
+
"cardboard tube",
|
| 587 |
+
"cardboard piece",
|
| 588 |
+
"cardboard cutout",
|
| 589 |
+
"cardboard pattern piece",
|
| 590 |
+
"box"
|
| 591 |
+
]
|
| 592 |
+
},
|
| 593 |
+
{
|
| 594 |
+
"id": "1796b943-caad-43c6-b9bd-80b8d601f37d__ep1:qa:40",
|
| 595 |
+
"episode_id": "1796b943-caad-43c6-b9bd-80b8d601f37d__ep1",
|
| 596 |
+
"true_action": "Move through the training room",
|
| 597 |
+
"predicted_object_count": 108,
|
| 598 |
+
"first_predicted_objects": [
|
| 599 |
+
"people",
|
| 600 |
+
"office chairs",
|
| 601 |
+
"desk",
|
| 602 |
+
"computer",
|
| 603 |
+
"laptop",
|
| 604 |
+
"office supplies",
|
| 605 |
+
"whiteboard",
|
| 606 |
+
"door",
|
| 607 |
+
"window",
|
| 608 |
+
"light fixture",
|
| 609 |
+
"wall",
|
| 610 |
+
"floor",
|
| 611 |
+
"box",
|
| 612 |
+
"cardboard",
|
| 613 |
+
"paper",
|
| 614 |
+
"plastic container",
|
| 615 |
+
"jar",
|
| 616 |
+
"bottle",
|
| 617 |
+
"canned food",
|
| 618 |
+
"snack package"
|
| 619 |
+
]
|
| 620 |
+
}
|
| 621 |
+
],
|
| 622 |
+
"modality_missing_by_episode": {
|
| 623 |
+
"8a8e1b3c-607e-4ada-b3fd-fa639727e92c__ep1": [
|
| 624 |
+
"visualization.rrd"
|
| 625 |
+
],
|
| 626 |
+
"a1012a57-385e-45a9-8a59-694a26fe92a5__ep1": [
|
| 627 |
+
"visualization.rrd"
|
| 628 |
+
],
|
| 629 |
+
"33f7ae08-ac1d-4321-9cb9-eca79016b359__ep1": [
|
| 630 |
+
"visualization.rrd"
|
| 631 |
+
],
|
| 632 |
+
"9c553886-83c5-4dc4-be5c-dcb269b3a771__ep2": [
|
| 633 |
+
"visualization.rrd"
|
| 634 |
+
],
|
| 635 |
+
"34f07a04-eb37-45a3-95ec-189ed5f4a85b__ep5": [
|
| 636 |
+
"visualization.rrd"
|
| 637 |
+
],
|
| 638 |
+
"b9dd769b-e31a-4fdb-945e-5a60db6487b0__ep2": [
|
| 639 |
+
"visualization.rrd"
|
| 640 |
+
],
|
| 641 |
+
"ba045ed4-ef25-404d-b756-8dcbd45b18fa__ep2": [
|
| 642 |
+
"visualization.rrd"
|
| 643 |
+
],
|
| 644 |
+
"4b02bb38-384a-438a-b5f9-6131d85c34b0__ep1": [
|
| 645 |
+
"visualization.rrd"
|
| 646 |
+
],
|
| 647 |
+
"5399ef86-4df9-49bc-809f-8f4f92f9e659__ep6": [
|
| 648 |
+
"visualization.rrd"
|
| 649 |
+
],
|
| 650 |
+
"b750fab3-7fbb-43a0-b451-c64c4d4a64da__ep1": [
|
| 651 |
+
"visualization.rrd"
|
| 652 |
+
],
|
| 653 |
+
"877779cd-25f3-4293-a3c4-39067dd9558c__ep4": [
|
| 654 |
+
"visualization.rrd"
|
| 655 |
+
],
|
| 656 |
+
"1796b943-caad-43c6-b9bd-80b8d601f37d__ep1": [
|
| 657 |
+
"visualization.rrd"
|
| 658 |
+
],
|
| 659 |
+
"ba18b7c1-21ff-45da-8452-41acce7fc8de__ep2": [
|
| 660 |
+
"visualization.rrd"
|
| 661 |
+
],
|
| 662 |
+
"b6579cb5-0a71-4ca6-8808-1e2700be05c7__ep3": [
|
| 663 |
+
"visualization.rrd"
|
| 664 |
+
]
|
| 665 |
+
},
|
| 666 |
+
"interpretation": "The diagnostic pilot is dominated by invalid or weak structured outputs and exact-label failures. These tables identify where to tighten JSON constraints, action/subtask target formatting, object vocabularies, and missing-modality robustness before claiming stronger model quality."
|
| 667 |
+
}
|
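The interpretation string above points at JSON constraints and object vocabularies. The `raw_prediction_prefix` values are cut off mid-string, and the overgeneration examples list long, partly repeated object labels. The following is a minimal sketch of a post-hoc check along those lines, not the method used by `scripts/omni/analyze_qwen3_omni_errors.py`. The helper names `parse_prediction` and `tidy_objects` and the `max_objects` cap are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import json


def parse_prediction(raw: str) -> dict | None:
    """Return the decoded JSON object, or None if the text is not a JSON object."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return value if isinstance(value, dict) else None


def tidy_objects(objects: list[str], max_objects: int = 10) -> list[str]:
    """Drop repeated labels (keeping first-seen order), then cap the list length.

    max_objects is an illustrative assumption, not a value from the source.
    """
    unique = list(dict.fromkeys(objects))
    return unique[:max_objects]


# A prediction cut off mid-string, shaped like the invalid_json_examples entries.
truncated = '{"action": "Close door", "contact": "yes", "objects": ["door", "door ha'

# The 20 first_predicted_objects reported for 33f7ae08-...__ep1:qa:70.
predicted = [
    "cardboard", "cardboard box", "cardboard pieces", "cardboard shapes",
    "cardboard squares", "cardboard tray", "cardboard tube", "cardboard pieces",
    "cardboard shapes", "cardboard squares", "cardboard tray", "cardboard tube",
    "blue foam pieces", "blue foam sheet", "blue product box", "blue strip",
    "canned food", "canned goods", "canned items", "cans",
]

print(parse_prediction(truncated))
print(tidy_objects(predicted))
```

Deduplicating before capping keeps the earliest distinct labels, which matters when the model loops over the same few phrases. Whether a cap is appropriate depends on how many objects the gold annotations typically contain, which this extract does not state.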
results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv
ADDED
|
@@ -0,0 +1,2 @@
|
| 1 |
+
group,samples,parsed_prediction_rate,action_exact_rate,subtask_exact_rate,transition_exact_rate,next_action_exact_rate,contact_exact_rate,object_precision,object_recall,object_f1
|
| 2 |
+
rrd_missing_only_required_modalities_present,448,0.8772321428571429,0.024553571428571428,0.006696428571428571,0.8504464285714286,0.024553571428571428,0.6450892857142857,0.19611111111111112,0.25841874084919475,0.22299431459254582
|
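The `object_f1` column in the row above is consistent with the harmonic mean of `object_precision` and `object_recall`. A short sketch that recomputes it from the reported values follows. The function name `f1_from` is hypothetical, and returning 0.0 when both inputs are zero is an assumption that matches the all-zero groups reported elsewhere in this commit.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero (assumed convention)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values copied from the rrd_missing_only_required_modalities_present row.
precision = 0.19611111111111112
recall = 0.25841874084919475
reported_f1 = 0.22299431459254582

recomputed = f1_from(precision, recall)
print(abs(recomputed - reported_f1) < 1e-9)
```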
results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv
ADDED
|
@@ -0,0 +1,11 @@
|
| 1 |
+
group,samples,parsed_prediction_rate,action_exact_rate,subtask_exact_rate,transition_exact_rate,next_action_exact_rate,contact_exact_rate,object_precision,object_recall,object_f1
|
| 2 |
+
furniture_room,96,0.71875,0.0,0.0,0.7083333333333334,0.0,0.4166666666666667,0.2534246575342466,0.2334384858044164,0.24302134646962234
|
| 3 |
+
other_object,135,0.7925925925925926,0.02962962962962963,0.007407407407407408,0.762962962962963,0.02962962962962963,0.6,0.13717693836978131,0.16428571428571428,0.1495124593716143
|
| 4 |
+
food_kitchen,56,0.8571428571428571,0.0,0.0,0.8214285714285714,0.0,0.7678571428571429,0.22277227722772278,0.2,0.2107728337236534
|
| 5 |
+
cleaning,8,0.875,0.0,0.0,0.875,0.0,0.625,0.04,0.047619047619047616,0.043478260869565216
|
| 6 |
+
phone_device,162,0.9074074074074074,0.024691358024691357,0.006172839506172839,0.8703703703703703,0.024691358024691357,0.5864197530864198,0.32521739130434785,0.3132328308207705,0.31911262798634815
|
| 7 |
+
paper_cardboard,261,0.9080459770114943,0.034482758620689655,0.011494252873563218,0.8888888888888888,0.034482758620689655,0.7203065134099617,0.22274881516587677,0.32339449541284404,0.2637979420018709
|
| 8 |
+
craft_small_object,106,0.9339622641509434,0.02830188679245283,0.009433962264150943,0.9150943396226415,0.02830188679245283,0.5,0.22662889518413598,0.25806451612903225,0.24132730015082954
|
| 9 |
+
retail_container,101,0.9405940594059405,0.0297029702970297,0.0,0.9108910891089109,0.0297029702970297,0.7722772277227723,0.20279720279720279,0.17522658610271905,0.18800648298217182
|
| 10 |
+
tool_stationery,138,0.9565217391304348,0.014492753623188406,0.0,0.9347826086956522,0.014492753623188406,0.8043478260869565,0.27906976744186046,0.3894523326572008,0.32514817950889074
|
| 11 |
+
no_object_label,2,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0
|
results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
group,samples,parsed_prediction_rate,action_exact_rate,subtask_exact_rate,transition_exact_rate,next_action_exact_rate,contact_exact_rate,object_precision,object_recall,object_f1
|
| 2 |
+
unseen_in_train,317,0.8454258675078864,0.015772870662460567,0.006309148264984227,0.8233438485804416,0.015772870662460567,0.6151419558359621,0.15804806991988346,0.23183760683760685,0.18796015591165008
|
| 3 |
+
seen_in_train,131,0.9541984732824428,0.04580152671755725,0.007633587786259542,0.916030534351145,0.04580152671755725,0.7175572519083969,0.3185011709601874,0.31627906976744186,0.3173862310385064
|
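The two train-seen groups above hold 317 + 131 = 448 samples, the same total as the single missing-modality row. For per-sample rates, the overall figure should therefore equal the sample-weighted mean of the two groups. The sketch below checks this for two columns using only numbers reported in this commit. The helper name `pooled_rate` is hypothetical.

```python
# Rows copied from train_seen_error_analysis.csv (two columns only).
groups = [
    {"group": "unseen_in_train", "samples": 317,
     "parsed_prediction_rate": 0.8454258675078864,
     "action_exact_rate": 0.015772870662460567},
    {"group": "seen_in_train", "samples": 131,
     "parsed_prediction_rate": 0.9541984732824428,
     "action_exact_rate": 0.04580152671755725},
]

# Overall row copied from missing_modality_error_analysis.csv.
overall = {"samples": 448,
           "parsed_prediction_rate": 0.8772321428571429,
           "action_exact_rate": 0.024553571428571428}


def pooled_rate(rows: list[dict], metric: str) -> float:
    """Sample-weighted mean of a per-sample rate across disjoint groups."""
    total = sum(row["samples"] for row in rows)
    return sum(row["samples"] * row[metric] for row in rows) / total


for metric in ("parsed_prediction_rate", "action_exact_rate"):
    print(metric, abs(pooled_rate(groups, metric) - overall[metric]) < 1e-9)
```

This weighting applies to the exact-match and parse rates. The object precision and recall columns do not pool this way from per-group ratios, since their denominators are object counts rather than sample counts.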
scripts/build_artifact_index.py
CHANGED
|
@@ -129,6 +129,14 @@ ARTIFACTS = [
|
|
| 129 |
"surface": "repo_hf",
|
| 130 |
"shows": "Builds synthetic verified packages for every configured backbone and audits them against the public-safe package contract.",
|
| 131 |
},
|
| 132 |
{
|
| 133 |
"id": "additional_development_directions",
|
| 134 |
"title": "Additional development directions",
|
|
@@ -674,6 +682,22 @@ ARTIFACTS = [
|
|
| 674 |
"surface": "repo_hf",
|
| 675 |
"shows": "Documents the public multi-episode access status and 32-episode pilot selection.",
|
| 676 |
},
|
| 677 |
{
|
| 678 |
"id": "citation",
|
| 679 |
"title": "Citation metadata",
|
|
|
|
| 129 |
"surface": "repo_hf",
|
| 130 |
"shows": "Builds synthetic verified packages for every configured backbone and audits them against the public-safe package contract.",
|
| 131 |
},
|
| 132 |
+
{
|
| 133 |
+
"id": "qwen3_omni_error_analysis_script",
|
| 134 |
+
"title": "Qwen3-Omni held-out error-analysis script",
|
| 135 |
+
"path": "scripts/omni/analyze_qwen3_omni_errors.py",
|
| 136 |
+
"kind": "scaleup_contract",
|
| 137 |
+
"surface": "repo_hf",
|
| 138 |
+
"shows": "Computes public-safe held-out error-analysis tables by episode, action family, train-seen status, required-modality state, and object category.",
|
| 139 |
+
},
|
| 140 |
{
|
| 141 |
"id": "additional_development_directions",
|
| 142 |
"title": "Additional development directions",
|
|
|
|
| 682 |
"surface": "repo_hf",
|
| 683 |
"shows": "Documents the public multi-episode access status and 32-episode pilot selection.",
|
| 684 |
},
|
| 685 |
+
{
|
| 686 |
+
"id": "qwen3_omni_error_analysis_report",
|
| 687 |
+
"title": "Qwen3-Omni held-out error-analysis report",
|
| 688 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 689 |
+
"kind": "scaleup_status",
|
| 690 |
+
"surface": "repo_hf",
|
| 691 |
+
"shows": "Summarizes validation-aware Qwen3-Omni held-out failures by episode, action family, train-seen status, required-modality state, and object category.",
|
| 692 |
+
},
|
| 693 |
+
{
|
| 694 |
+
"id": "qwen3_omni_error_analysis_json",
|
| 695 |
+
"title": "Qwen3-Omni held-out error-analysis JSON",
|
| 696 |
+
"path": "results/omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 697 |
+
"kind": "scaleup_status",
|
| 698 |
+
"surface": "repo_hf",
|
| 699 |
+
"shows": "Machine-readable Qwen3-Omni held-out error analysis with grouped metrics and sanitized failure examples.",
|
| 700 |
+
},
|
| 701 |
{
|
| 702 |
"id": "citation",
|
| 703 |
"title": "Citation metadata",
|
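The three added `ARTIFACTS` entries all carry the same six keys (`id`, `title`, `path`, `kind`, `surface`, `shows`). A minimal sketch of a consistency check over such entries follows; `REQUIRED_KEYS` and `check_artifacts` are hypothetical names used only for illustration and are not taken from the repository's index builder.

```python
from collections import Counter
from typing import Any

# Keys observed on the entries in the diff above (assumed to be required).
REQUIRED_KEYS = ("id", "title", "path", "kind", "surface", "shows")


def check_artifacts(artifacts: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means the entries look consistent."""
    problems: list[str] = []
    for index, entry in enumerate(artifacts):
        # A key counts as missing when it is absent, None, or blank.
        missing = [key for key in REQUIRED_KEYS if not str(entry.get(key) or "").strip()]
        if missing:
            problems.append(f"entry {index} ({entry.get('id', '?')}): missing {', '.join(missing)}")
    # Each id should appear once, since ids act as lookup handles.
    counts = Counter(entry.get("id") for entry in artifacts)
    for artifact_id, count in counts.items():
        if artifact_id is not None and count > 1:
            problems.append(f"duplicate id: {artifact_id}")
    return problems


# Entry copied from the diff above.
script_entry = {
    "id": "qwen3_omni_error_analysis_script",
    "title": "Qwen3-Omni held-out error-analysis script",
    "path": "scripts/omni/analyze_qwen3_omni_errors.py",
    "kind": "scaleup_contract",
    "surface": "repo_hf",
    "shows": "Computes public-safe held-out error-analysis tables by episode, action family, "
             "train-seen status, required-modality state, and object category.",
}

ok_problems = check_artifacts([script_entry])
# Illustrative malformed input: a duplicated id and an entry with no path.
bad_problems = check_artifacts([script_entry, script_entry, {"id": "x", "title": "t"}])
```

Running a check like this before regenerating the index would surface a missing `path` or a reused `id` early, rather than in a downstream audit.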
scripts/omni/analyze_qwen3_omni_errors.py
ADDED
|
@@ -0,0 +1,370 @@
|
|
| 1 |
+
#!/usr/bin/env python3
|
| 2 |
+
"""Analyze public-safe Qwen3-Omni held-out prediction errors.
|
| 3 |
+
|
| 4 |
+
The script consumes a verified public package, not raw Xperience-10M data. It
|
| 5 |
+
summarizes where the diagnostic pilot fails by episode, train-seen status,
|
| 6 |
+
coarse action family, object category, parsed prediction state, and
|
| 7 |
+
required-modality state. The outputs are small derived CSV/JSON/Markdown
|
| 8 |
+
artifacts suitable for the public package.
|
| 9 |
+
"""
|
| 10 |
+
|
| 11 |
+
from __future__ import annotations
|
| 12 |
+
|
| 13 |
+
import argparse
|
| 14 |
+
import csv
|
| 15 |
+
import json
|
| 16 |
+
from collections import Counter, defaultdict
|
| 17 |
+
from pathlib import Path
|
| 18 |
+
from typing import Any
|
| 19 |
+
|
| 20 |
+
|
| 21 |
+
DEFAULT_PACKAGE = (
|
| 22 |
+
Path(__file__).resolve().parents[2]
|
| 23 |
+
/ "results/omni_finetune/verified_public/"
|
| 24 |
+
/ "xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval"
|
| 25 |
+
)
|
| 26 |
+
|
| 27 |
+
ACTION_FAMILIES = [
|
| 28 |
+
("phone_use", ("phone", "smartphone", "watch", "screen")),
|
| 29 |
+
("paper_cardboard_craft", ("paper", "cardboard", "fold", "cut", "draw", "mark", "ruler", "scissors", "lantern", "star")),
|
| 30 |
+
("retail_stocking", ("shelf", "product", "can", "canned", "container", "box", "grocery", "stock")),
|
| 31 |
+
("small_object_sorting", ("bead", "button", "tile", "mahjong", "puzzle", "piece")),
|
| 32 |
+
("cleaning", ("clean", "wipe", "wash", "vacuum", "sweep", "trash")),
|
| 33 |
+
("locomotion", ("walk", "approach", "enter", "move through", "arrive", "leave")),
|
| 34 |
+
("food_kitchen", ("kettle", "rice", "saucepan", "kitchen", "bottle", "jar", "lid")),
|
| 35 |
+
]
|
| 36 |
+
|
| 37 |
+
OBJECT_CATEGORIES = [
|
| 38 |
+
("phone_device", ("phone", "smartphone", "watch", "charger", "cable", "power bank", "earbud")),
|
| 39 |
+
("paper_cardboard", ("paper", "cardboard", "lantern", "origami", "star", "ribbon")),
|
| 40 |
+
("tool_stationery", ("scissors", "knife", "ruler", "marker", "pen", "stapler", "glue", "tape")),
|
| 41 |
+
("retail_container", ("shelf", "container", "product", "box", "can", "canned", "package", "bag")),
|
| 42 |
+
("furniture_room", ("table", "chair", "desk", "counter", "sink", "door", "wall", "floor")),
|
| 43 |
+
("food_kitchen", ("kettle", "rice", "saucepan", "jar", "bottle", "food", "kitchen")),
|
| 44 |
+
("craft_small_object", ("bead", "button", "tile", "mahjong", "puzzle", "foam", "piece")),
|
| 45 |
+
("cleaning", ("vacuum", "broom", "cloth", "towel", "trash")),
|
| 46 |
+
]
|
| 47 |
+
|
| 48 |
+
REQUIRED_VIDEO_FILES = {
|
| 49 |
+
"fisheye_cam0.mp4",
|
| 50 |
+
"fisheye_cam1.mp4",
|
| 51 |
+
"fisheye_cam2.mp4",
|
| 52 |
+
"fisheye_cam3.mp4",
|
| 53 |
+
"stereo_left.mp4",
|
| 54 |
+
"stereo_right.mp4",
|
| 55 |
+
}
|
| 56 |
+
|
| 57 |
+
REQUIRED_HDF5_MODALITIES = {
|
| 58 |
+
"calibration",
|
| 59 |
+
"slam_pose",
|
| 60 |
+
"slam_point_cloud",
|
| 61 |
+
"depth",
|
| 62 |
+
"depth_confidence",
|
| 63 |
+
"hand_mocap",
|
| 64 |
+
"body_mocap",
|
| 65 |
+
"contacts",
|
| 66 |
+
"imu",
|
| 67 |
+
"caption",
|
| 68 |
+
}
|
| 69 |
+
|
| 70 |
+
|
| 71 |
+
def parse_args() -> argparse.Namespace:
|
| 72 |
+
parser = argparse.ArgumentParser(description=__doc__)
|
| 73 |
+
parser.add_argument("--package-dir", type=Path, default=DEFAULT_PACKAGE)
|
| 74 |
+
parser.add_argument("--output-dir", type=Path)
|
| 75 |
+
parser.add_argument("--max-examples", type=int, default=12)
|
| 76 |
+
return parser.parse_args()
|
| 77 |
+
|
| 78 |
+
|
| 79 |
+
def load_json(path: Path) -> dict[str, Any]:
|
| 80 |
+
return json.loads(path.read_text(encoding="utf-8"))
|
| 81 |
+
|
| 82 |
+
|
| 83 |
+
def load_jsonl(path: Path) -> list[dict[str, Any]]:
|
| 84 |
+
rows = []
|
| 85 |
+
with path.open("r", encoding="utf-8") as handle:
|
| 86 |
+
for line in handle:
|
| 87 |
+
line = line.strip()
|
| 88 |
+
if line:
|
| 89 |
+
rows.append(json.loads(line))
|
| 90 |
+
return rows
|
| 91 |
+
|
| 92 |
+
|
| 93 |
+
def norm(value: Any) -> str:
|
| 94 |
+
return str(value or "").strip().lower()
|
| 95 |
+
|
| 96 |
+
|
| 97 |
+
def family_for(text: str, families: list[tuple[str, tuple[str, ...]]], fallback: str = "other") -> str:
|
| 98 |
+
low = norm(text)
|
| 99 |
+
for name, keywords in families:
|
| 100 |
+
if any(keyword in low for keyword in keywords):
|
| 101 |
+
return name
|
| 102 |
+
return fallback
|
| 103 |
+
|
| 104 |
+
|
| 105 |
+
def object_categories(objects: list[Any]) -> set[str]:
|
| 106 |
+
categories: set[str] = set()
|
| 107 |
+
for obj in objects:
|
| 108 |
+
categories.add(family_for(str(obj), OBJECT_CATEGORIES, "other_object"))
|
| 109 |
+
return categories or {"no_object_label"}
|
| 110 |
+
|
| 111 |
+
|
| 112 |
+
def f1(precision: float, recall: float) -> float:
|
| 113 |
+
if precision + recall == 0:
|
| 114 |
+
return 0.0
|
| 115 |
+
return 2 * precision * recall / (precision + recall)
|
| 116 |
+
|
| 117 |
+
|
| 118 |
+
def bool_metric(row: dict[str, Any], key: str) -> bool:
|
| 119 |
+
true_json = row.get("true_json") or {}
|
| 120 |
+
pred_json = row.get("pred_json") or {}
|
| 121 |
+
return norm(true_json.get(key)) == norm(pred_json.get(key)) and bool(pred_json)
|
| 122 |
+
|
| 123 |
+
|
| 124 |
+
def object_overlap(row: dict[str, Any]) -> tuple[int, int, int]:
|
| 125 |
+
true_objects = {norm(item) for item in (row.get("true_json") or {}).get("objects", []) if norm(item)}
|
| 126 |
+
pred_objects = {norm(item) for item in (row.get("pred_json") or {}).get("objects", []) if norm(item)}
|
| 127 |
+
return len(true_objects & pred_objects), len(pred_objects), len(true_objects)
|
| 128 |
+
|
| 129 |
+
|
| 130 |
+
def modality_state(episode: dict[str, Any] | None) -> tuple[str, list[str]]:
|
| 131 |
+
if not episode:
|
| 132 |
+
return "episode_manifest_missing", ["episode_manifest_missing"]
|
| 133 |
+
missing: list[str] = []
|
| 134 |
+
files = {str(item.get("name")): bool(item.get("exists")) for item in episode.get("files", [])}
|
| 135 |
+
for filename in sorted(REQUIRED_VIDEO_FILES):
|
| 136 |
+
if not files.get(filename):
|
| 137 |
+
missing.append(filename)
|
| 138 |
+
hdf5 = episode.get("hdf5_modalities") or {}
|
| 139 |
+
for modality in sorted(REQUIRED_HDF5_MODALITIES):
|
| 140 |
+
if not hdf5.get(modality):
|
| 141 |
+
missing.append(modality)
|
| 142 |
+
if missing:
|
| 143 |
+
return "missing_required_modalities", missing
|
| 144 |
+
if files.get("visualization.rrd") is False:
|
| 145 |
+
return "rrd_missing_only_required_modalities_present", ["visualization.rrd"]
|
| 146 |
+
return "required_modalities_present", []
|
| 147 |
+
|
| 148 |
+
|
| 149 |
+
def add_row_stats(bucket: dict[str, Any], row: dict[str, Any]) -> None:
|
| 150 |
+
bucket["samples"] += 1
|
| 151 |
+
valid = bool(row.get("pred_json"))
|
| 152 |
+
bucket["parsed_predictions"] += int(valid)
|
| 153 |
+
bucket["action_exact"] += int(bool_metric(row, "action"))
|
| 154 |
+
bucket["subtask_exact"] += int(bool_metric(row, "subtask"))
|
| 155 |
+
bucket["transition_exact"] += int(bool_metric(row, "transition"))
|
| 156 |
+
bucket["next_action_exact"] += int(bool_metric(row, "next_action"))
|
| 157 |
+
bucket["contact_exact"] += int(bool_metric(row, "contact"))
|
| 158 |
+
matched, pred_count, true_count = object_overlap(row)
|
| 159 |
+
bucket["object_matched"] += matched
|
| 160 |
+
bucket["object_predicted"] += pred_count
|
| 161 |
+
bucket["object_true"] += true_count
|
| 162 |
+
|
| 163 |
+
|
| 164 |
+
def empty_bucket() -> dict[str, Any]:
|
| 165 |
+
return {
|
| 166 |
+
"samples": 0,
|
| 167 |
+
"parsed_predictions": 0,
|
| 168 |
+
"action_exact": 0,
|
| 169 |
+
"subtask_exact": 0,
|
| 170 |
+
"transition_exact": 0,
|
| 171 |
+
"next_action_exact": 0,
|
| 172 |
+
"contact_exact": 0,
|
| 173 |
+
"object_matched": 0,
|
| 174 |
+
"object_predicted": 0,
|
| 175 |
+
"object_true": 0,
|
| 176 |
+
}
|
| 177 |
+
|
| 178 |
+
|
| 179 |
+
def finalize_bucket(name: str, bucket: dict[str, Any]) -> dict[str, Any]:
|
| 180 |
+
samples = max(int(bucket["samples"]), 1)
|
| 181 |
+
precision = bucket["object_matched"] / bucket["object_predicted"] if bucket["object_predicted"] else 0.0
|
| 182 |
+
recall = bucket["object_matched"] / bucket["object_true"] if bucket["object_true"] else 0.0
|
| 183 |
+
return {
|
| 184 |
+
"group": name,
|
| 185 |
+
"samples": bucket["samples"],
|
| 186 |
+
"parsed_prediction_rate": bucket["parsed_predictions"] / samples,
|
| 187 |
+
"action_exact_rate": bucket["action_exact"] / samples,
|
| 188 |
+
"subtask_exact_rate": bucket["subtask_exact"] / samples,
|
| 189 |
+
"transition_exact_rate": bucket["transition_exact"] / samples,
|
| 190 |
+
"next_action_exact_rate": bucket["next_action_exact"] / samples,
|
| 191 |
+
"contact_exact_rate": bucket["contact_exact"] / samples,
|
| 192 |
+
"object_precision": precision,
|
| 193 |
+
"object_recall": recall,
|
| 194 |
+
"object_f1": f1(precision, recall),
|
| 195 |
+
}
|
| 196 |
+
|
| 197 |
+
|
| 198 |
+
def write_csv(path: Path, rows: list[dict[str, Any]]) -> None:
|
| 199 |
+
path.parent.mkdir(parents=True, exist_ok=True)
|
| 200 |
+
if not rows:
|
| 201 |
+
path.write_text("", encoding="utf-8")
|
| 202 |
+
return
|
| 203 |
+
with path.open("w", encoding="utf-8", newline="") as handle:
|
| 204 |
+
writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()), lineterminator="\n")
|
| 205 |
+
writer.writeheader()
|
| 206 |
+
writer.writerows(rows)
|
| 207 |
+
|
| 208 |
+
|
| 209 |
+
def top_rows(groups: dict[str, dict[str, Any]], *, min_samples: int = 1, reverse: bool = False) -> list[dict[str, Any]]:
|
| 210 |
+
rows = [finalize_bucket(name, bucket) for name, bucket in groups.items() if bucket["samples"] >= min_samples]
|
| 211 |
+
return sorted(rows, key=lambda row: (row["parsed_prediction_rate"], row["action_exact_rate"], row["samples"]), reverse=reverse)
|
| 212 |
+
|
| 213 |
+
|
| 214 |
+
def markdown_table(rows: list[dict[str, Any]], columns: list[str], limit: int = 8) -> list[str]:
|
| 215 |
+
selected = rows[:limit]
|
| 216 |
+
if not selected:
|
| 217 |
+
return ["No rows."]
|
| 218 |
+
lines = ["| " + " | ".join(columns) + " |", "| " + " | ".join("---" for _ in columns) + " |"]
|
| 219 |
+
for row in selected:
|
| 220 |
+
values = []
|
| 221 |
+
for col in columns:
|
| 222 |
+
value = row.get(col)
|
| 223 |
+
if isinstance(value, float):
|
| 224 |
+
values.append(f"{value:.4f}")
|
| 225 |
+
else:
|
| 226 |
+
values.append(str(value))
|
| 227 |
+
lines.append("| " + " | ".join(values) + " |")
|
| 228 |
+
return lines
|
| 229 |
+
|
| 230 |
+
|
| 231 |
+
def main() -> int:
|
| 232 |
+
args = parse_args()
|
| 233 |
+
package_dir = args.package_dir.expanduser().resolve()
|
| 234 |
+
output_dir = args.output_dir or package_dir / "analysis"
|
| 235 |
+
output_dir = output_dir.expanduser().resolve()
|
| 236 |
+
|
| 237 |
+
predictions = load_jsonl(package_dir / "eval" / "predictions.jsonl")
|
| 238 |
+
metrics = load_json(package_dir / "eval" / "metrics.json")
|
| 239 |
+
episode_manifest = load_json(package_dir / "dataset" / "episode_manifest.json")
|
| 240 |
+
episodes = {episode.get("episode_id"): episode for episode in episode_manifest.get("episodes", [])}
|
| 241 |
+
|
| 242 |
+
overall = empty_bucket()
|
| 243 |
+
by_episode: dict[str, dict[str, Any]] = defaultdict(empty_bucket)
|
| 244 |
+
by_family: dict[str, dict[str, Any]] = defaultdict(empty_bucket)
|
| 245 |
+
by_seen: dict[str, dict[str, Any]] = defaultdict(empty_bucket)
|
| 246 |
+
by_modality: dict[str, dict[str, Any]] = defaultdict(empty_bucket)
|
| 247 |
+
by_object_category: dict[str, dict[str, Any]] = defaultdict(empty_bucket)
|
| 248 |
+
invalid_examples = []
|
| 249 |
+
overgenerated_examples = []
|
| 250 |
+
modality_missing_by_episode: dict[str, list[str]] = {}
|
| 251 |
+
|
| 252 |
+
for row in predictions:
|
| 253 |
+
episode_id = str(row.get("episode_id"))
|
| 254 |
+
true_json = row.get("true_json") or {}
|
| 255 |
+
pred_json = row.get("pred_json") or {}
|
| 256 |
+
add_row_stats(overall, row)
|
| 257 |
+
add_row_stats(by_episode[episode_id], row)
|
| 258 |
+
add_row_stats(by_family[family_for(str(true_json.get("action")), ACTION_FAMILIES)], row)
|
| 259 |
+
add_row_stats(by_seen["seen_in_train" if row.get("true_label_seen_in_train") else "unseen_in_train"], row)
|
| 260 |
+
state, missing = modality_state(episodes.get(episode_id))
|
| 261 |
+
modality_missing_by_episode.setdefault(episode_id, missing)
|
| 262 |
+
add_row_stats(by_modality[state], row)
|
| 263 |
+
for category in object_categories(true_json.get("objects", [])):
|
| 264 |
+
add_row_stats(by_object_category[category], row)
|
| 265 |
+
if not pred_json and len(invalid_examples) < args.max_examples:
|
| 266 |
+
invalid_examples.append({
|
| 267 |
+
"id": row.get("id"),
|
| 268 |
+
"episode_id": episode_id,
|
| 269 |
+
"true_action": true_json.get("action"),
|
| 270 |
+
"raw_prediction_prefix": str(row.get("raw_prediction", ""))[:240],
|
| 271 |
+
})
|
| 272 |
+
pred_objects = pred_json.get("objects", []) if isinstance(pred_json, dict) else []
|
| 273 |
+
if len(pred_objects) > 20 and len(overgenerated_examples) < args.max_examples:
|
| 274 |
+
overgenerated_examples.append({
|
| 275 |
+
"id": row.get("id"),
|
| 276 |
+
"episode_id": episode_id,
|
| 277 |
+
"true_action": true_json.get("action"),
|
| 278 |
+
"predicted_object_count": len(pred_objects),
|
| 279 |
+
"first_predicted_objects": pred_objects[:20],
|
| 280 |
+
})
|
| 281 |
+
|
| 282 |
+
episode_rows = top_rows(by_episode)
|
| 283 |
+
family_rows = top_rows(by_family)
|
| 284 |
+
seen_rows = top_rows(by_seen)
|
| 285 |
+
modality_rows = top_rows(by_modality)
|
| 286 |
+
object_rows = top_rows(by_object_category)
|
| 287 |
+
|
| 288 |
+
write_csv(output_dir / "episode_error_analysis.csv", episode_rows)
|
| 289 |
+
write_csv(output_dir / "action_family_error_analysis.csv", family_rows)
|
| 290 |
+
write_csv(output_dir / "train_seen_error_analysis.csv", seen_rows)
|
| 291 |
+
write_csv(output_dir / "missing_modality_error_analysis.csv", modality_rows)
|
| 292 |
+
write_csv(output_dir / "object_category_error_analysis.csv", object_rows)
|
| 293 |
+
|
| 294 |
+
summary = {
|
| 295 |
+
"status": "pass",
|
| 296 |
+
"source_package": package_dir.name,
|
| 297 |
+
"source_prediction_rows": len(predictions),
|
| 298 |
+
"metrics_json_validity_rate": metrics.get("json_validity_rate"),
|
| 299 |
+
"computed": finalize_bucket("overall", overall),
|
| 300 |
+
"worst_episode_groups": episode_rows[:8],
|
| 301 |
+
"action_family_groups": family_rows,
|
| 302 |
+
"train_seen_groups": seen_rows,
|
| 303 |
+
"missing_modality_groups": modality_rows,
|
| 304 |
+
"object_category_groups": object_rows,
|
| 305 |
+
"invalid_json_examples": invalid_examples,
|
| 306 |
+
"object_overgeneration_examples": overgenerated_examples,
|
| 307 |
+
"modality_missing_by_episode": modality_missing_by_episode,
|
| 308 |
+
"interpretation": (
|
| 309 |
+
"The diagnostic pilot is dominated by invalid or weak structured outputs and exact-label failures. "
|
| 310 |
+
"These tables identify where to tighten JSON constraints, action/subtask target formatting, object vocabularies, "
|
| 311 |
+
"and missing-modality robustness before claiming stronger model quality."
|
| 312 |
+
),
|
| 313 |
+
}
|
| 314 |
+
(output_dir / "error_analysis_summary.json").write_text(json.dumps(summary, indent=2) + "\n", encoding="utf-8")
|
| 315 |
+
|
| 316 |
+
report = [
|
| 317 |
+
"# Qwen3-Omni Held-Out Error Analysis",
|
| 318 |
+
"",
|
| 319 |
+
"This report is computed from the verified public package predictions. It contains only derived metrics and sanitized examples.",
|
| 320 |
+
"",
|
| 321 |
+
"## Overall",
|
| 322 |
+
"",
|
| 323 |
+
f"- Prediction rows: `{len(predictions)}`",
|
| 324 |
+
f"- JSON validity from `metrics.json`: `{summary['metrics_json_validity_rate']:.4f}`",
|
| 325 |
+
f"- Parsed prediction rate from public rows: `{summary['computed']['parsed_prediction_rate']:.4f}`",
|
| 326 |
+
f"- Action exact rate: `{summary['computed']['action_exact_rate']:.4f}`",
|
| 327 |
+
f"- Subtask exact rate: `{summary['computed']['subtask_exact_rate']:.4f}`",
|
| 328 |
+
f"- Contact exact rate: `{summary['computed']['contact_exact_rate']:.4f}`",
|
| 329 |
+
f"- Object F1: `{summary['computed']['object_f1']:.4f}`",
|
| 330 |
+
"",
|
| 331 |
+
"## Weakest Episode Groups",
|
| 332 |
+
"",
|
| 333 |
+
*markdown_table(episode_rows, ["group", "samples", "parsed_prediction_rate", "action_exact_rate", "object_f1"]),
|
| 334 |
+
"",
|
| 335 |
+
"## Action Families",
|
| 336 |
+
"",
|
| 337 |
+
*markdown_table(family_rows, ["group", "samples", "parsed_prediction_rate", "action_exact_rate", "subtask_exact_rate", "object_f1"]),
|
| 338 |
+
"",
|
| 339 |
+
"## Train-Seen Split",
|
| 340 |
+
"",
|
| 341 |
+
*markdown_table(seen_rows, ["group", "samples", "parsed_prediction_rate", "action_exact_rate", "next_action_exact_rate"]),
|
| 342 |
+
"",
|
| 343 |
+
"## Required-Modality State",
|
| 344 |
+
"",
|
| 345 |
+
*markdown_table(modality_rows, ["group", "samples", "parsed_prediction_rate", "action_exact_rate", "object_f1"]),
|
| 346 |
+
"",
|
| 347 |
+
"## Object Categories",
|
| 348 |
+
"",
|
| 349 |
+
*markdown_table(object_rows, ["group", "samples", "object_precision", "object_recall", "object_f1"]),
|
| 350 |
+
"",
|
| 351 |
+
"## Interpretation",
|
| 352 |
+
"",
|
| 353 |
+
summary["interpretation"],
|
| 354 |
+
"",
|
| 355 |
+
"Generated files:",
|
| 356 |
+
"",
|
| 357 |
+
"- `error_analysis_summary.json`",
|
| 358 |
+
"- `episode_error_analysis.csv`",
|
| 359 |
+
"- `action_family_error_analysis.csv`",
|
| 360 |
+
"- `train_seen_error_analysis.csv`",
|
| 361 |
+
"- `missing_modality_error_analysis.csv`",
|
| 362 |
+
"- `object_category_error_analysis.csv`",
|
| 363 |
+
]
|
| 364 |
+
(output_dir / "ERROR_ANALYSIS.md").write_text("\n".join(report) + "\n", encoding="utf-8")
|
| 365 |
+
print(json.dumps({"status": "pass", "output_dir": str(output_dir), "prediction_rows": len(predictions)}, indent=2))
|
| 366 |
+
return 0
|
| 367 |
+
|
| 368 |
+
|
| 369 |
+
if __name__ == "__main__":
|
| 370 |
+
raise SystemExit(main())
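The `family_for` helper in the script above assigns the first family whose keyword occurs as a substring of the lower-cased label, so list order and partial-word matches both affect the grouping. The sketch below reproduces that logic with the `ACTION_FAMILIES` table copied from the script; the example labels are hypothetical and are not taken from the dataset.

```python
from typing import Any

# Copied from the script above; order matters because the first match wins.
ACTION_FAMILIES = [
    ("phone_use", ("phone", "smartphone", "watch", "screen")),
    ("paper_cardboard_craft", ("paper", "cardboard", "fold", "cut", "draw", "mark", "ruler", "scissors", "lantern", "star")),
    ("retail_stocking", ("shelf", "product", "can", "canned", "container", "box", "grocery", "stock")),
    ("small_object_sorting", ("bead", "button", "tile", "mahjong", "puzzle", "piece")),
    ("cleaning", ("clean", "wipe", "wash", "vacuum", "sweep", "trash")),
    ("locomotion", ("walk", "approach", "enter", "move through", "arrive", "leave")),
    ("food_kitchen", ("kettle", "rice", "saucepan", "kitchen", "bottle", "jar", "lid")),
]


def norm(value: Any) -> str:
    return str(value or "").strip().lower()


def family_for(text: str, families: list[tuple[str, tuple[str, ...]]], fallback: str = "other") -> str:
    low = norm(text)
    for name, keywords in families:
        # Plain substring test: "draw" also matches inside "drawer".
        if any(keyword in low for keyword in keywords):
            return name
    return fallback


# Hypothetical labels chosen to show first-match and substring behavior.
examples = {
    label: family_for(label, ACTION_FAMILIES)
    for label in ("open the drawer", "walk to the shelf", "wipe the table", "stir the soup")
}
# "open the drawer"   -> paper_cardboard_craft ("draw" is a substring of "drawer")
# "walk to the shelf" -> retail_stocking ("shelf" is checked before "walk")
# "wipe the table"    -> cleaning
# "stir the soup"     -> other (no keyword matches)
```

When reading the action-family table, a group such as `paper_cardboard_craft` may therefore contain labels that matched on a partial word; word-boundary matching would be one way to tighten this, at the cost of missing plurals and inflected forms.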
|
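The object metrics in the script are micro-averaged: `add_row_stats` sums matched, predicted, and true object counts across all rows in a bucket, and `finalize_bucket` divides once at the end. A minimal sketch of that arithmetic, assuming rows shaped like the `predictions.jsonl` records the script reads; `micro_object_scores` and the two sample rows are hypothetical.

```python
from typing import Any


def norm(value: Any) -> str:
    return str(value or "").strip().lower()


def object_overlap(row: dict[str, Any]) -> tuple[int, int, int]:
    # Same set-based comparison as the script: case-insensitive, duplicates collapsed.
    true_objects = {norm(item) for item in (row.get("true_json") or {}).get("objects", []) if norm(item)}
    pred_objects = {norm(item) for item in (row.get("pred_json") or {}).get("objects", []) if norm(item)}
    return len(true_objects & pred_objects), len(pred_objects), len(true_objects)


def micro_object_scores(rows: list[dict[str, Any]]) -> tuple[float, float, float]:
    matched = predicted = true = 0
    for row in rows:
        m, p, t = object_overlap(row)
        matched, predicted, true = matched + m, predicted + p, true + t
    # Divide once over the pooled counts (micro average), guarding empty denominators.
    precision = matched / predicted if predicted else 0.0
    recall = matched / true if true else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical rows: one parsed prediction and one unparsed prediction (pred_json is None).
rows = [
    {"true_json": {"objects": ["Phone", "table"]}, "pred_json": {"objects": ["phone", "cup"]}},
    {"true_json": {"objects": ["paper"]}, "pred_json": None},
]
precision, recall, f1_score = micro_object_scores(rows)
# Pooled counts: matched=1, predicted=2, true=3, so precision=0.5 and recall=1/3.
```

Because an unparsed prediction contributes zero predicted objects but still adds its true objects, invalid JSON lowers recall without changing precision in this scheme.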
scripts/validate_mirror_parity.py
CHANGED
|
@@ -30,6 +30,7 @@ DATA_FILES = [
|
|
| 30 |
"foundation_model_plan.json",
|
| 31 |
"live_publication_status.json",
|
| 32 |
"modality_atlas.json",
|
| 33 |
"project_brief.json",
|
| 34 |
"project_manifest.json",
|
| 35 |
"project_packet.json",
|
|
@@ -76,6 +77,7 @@ ASSET_FILES = [
|
|
| 76 |
]
|
| 77 |
|
| 78 |
SCRIPT_FILES = [
|
| 79 |
"audio_ablation_and_raw_upgrade.py",
|
| 80 |
"build_artifact_index.py",
|
| 81 |
"build_brand_assets.py",
|
|
@@ -122,9 +124,18 @@ RESULT_FILES = [
|
|
| 122 |
"single_episode_diagnostics/timeline_overlay/timeline_overlay.csv",
|
| 123 |
"single_episode_diagnostics/alignment_stress/alignment_shift_metrics.csv",
|
| 124 |
"single_episode_diagnostics/alignment_stress/alignment_stress_summary.json",
|
| 125 |
]
|
| 126 |
|
| 127 |
DOC_FILES = [
|
| 128 |
"QUALITY_GATES.md",
|
| 129 |
"EVALUATION_PROTOCOL.md",
|
| 130 |
"FIGURE_INDEX.md",
|
|
|
|
| 30 |
"foundation_model_plan.json",
|
| 31 |
"live_publication_status.json",
|
| 32 |
"modality_atlas.json",
|
| 33 |
+
"omni_finetune_verified_result.json",
|
| 34 |
"project_brief.json",
|
| 35 |
"project_manifest.json",
|
| 36 |
"project_packet.json",
|
|
|
|
| 77 |
]
|
| 78 |
|
| 79 |
SCRIPT_FILES = [
|
| 80 |
+
"omni/analyze_qwen3_omni_errors.py",
|
| 81 |
"audio_ablation_and_raw_upgrade.py",
|
| 82 |
"build_artifact_index.py",
|
| 83 |
"build_brand_assets.py",
|
|
|
|
| 124 |
"single_episode_diagnostics/timeline_overlay/timeline_overlay.csv",
|
| 125 |
"single_episode_diagnostics/alignment_stress/alignment_shift_metrics.csv",
|
| 126 |
"single_episode_diagnostics/alignment_stress/alignment_stress_summary.json",
|
| 127 |
+
"omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/ERROR_ANALYSIS.md",
|
| 128 |
+
"omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/error_analysis_summary.json",
|
| 129 |
+
"omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/episode_error_analysis.csv",
|
| 130 |
+
"omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/action_family_error_analysis.csv",
|
| 131 |
+
"omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/train_seen_error_analysis.csv",
|
| 132 |
+
"omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/missing_modality_error_analysis.csv",
|
| 133 |
+
"omni_finetune/verified_public/xperience10m_qwen3_omni_128ep_96train_16val_16test_valmon_20260605_eval/analysis/object_category_error_analysis.csv",
|
| 134 |
]
|
| 135 |
|
| 136 |
DOC_FILES = [
|
| 137 |
+
"ARTIFACT_GUIDE.md",
|
| 138 |
+
"OMNI_MODEL_EXTENSION_CONTRACT.md",
|
| 139 |
"QUALITY_GATES.md",
|
| 140 |
"EVALUATION_PROTOCOL.md",
|
| 141 |
"FIGURE_INDEX.md",
|