THEORETICAL VALIDATION
Comparative analysis of structured task datasets for robotics
ACADEMIC VALIDATION
Recent research demonstrates the value of manual-derived procedural knowledge for robotics:
CheckManual (CoRL 2025)
Robots that operated novel appliances by grounding their actions in user manuals achieved a 30-50% improvement in zero-shot success rates.
ApBot (ICRA 2025)
Structured procedures from appliance documentation significantly improved task completion compared to demonstration-only learning.
RoboCasa / RoboCasa365
Sim-to-real transfer benefits from procedural fidelity — synthetic tasks alone miss real appliance complexity.
"Every major humanoid effort highlights the same gap: robots lack structured task data for real-world objects."
DATASET COMPARISON
| DATASET | SCALE | STRUCTURED | CONSTRAINTS | FAILURE MODES |
|---|---|---|---|---|
| Syngraph | 500+ procedures | | | |
| Open X-Embodiment | 1M+ trajectories | | | |
| DROID | 76k trajectories | | | |
| RH20T | ~200 tasks | | | |
| RoboMIND | ~150 primitives | | | |
Trajectory datasets show what happened. Syngraph specifies what should happen — and what to do when it doesn't.
KEY DIFFERENTIATORS
SCALE
Largest structured procedure dataset for robotics. Comprehensive coverage across household appliance categories.
STRUCTURE
Formal schema with explicit preconditions, actions, effects. Directly executable by planners — no interpretation layer needed.
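To make the precondition/action/effect idea concrete, here is a minimal sketch of how a planner might consume such a step. It assumes a STRIPS-style representation with facts as plain strings; the `Step` class, its field names, and the example facts are hypothetical illustrations, not the Syngraph schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One procedure step in precondition/action/effect form (hypothetical shape)."""
    action: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset = frozenset()


def applicable(step, state):
    """A step may run only when every precondition holds in the current state."""
    return step.preconditions <= state


def apply_step(step, state):
    """Return the new state after the step, or raise if a precondition is unmet."""
    if not applicable(step, state):
        missing = sorted(step.preconditions - state)
        raise ValueError(f"cannot run {step.action!r}; unmet preconditions: {missing}")
    return (state - step.del_effects) | step.add_effects


def run(steps, state):
    """Apply steps in order, threading the state through each one."""
    for step in steps:
        state = apply_step(step, state)
    return state


# Hypothetical two-step procedure for an appliance with a door and a start button.
close_door = Step(
    action="close_door",
    preconditions=frozenset({"door_open"}),
    add_effects=frozenset({"door_closed"}),
    del_effects=frozenset({"door_open"}),
)
press_start = Step(
    action="press_start",
    preconditions=frozenset({"door_closed", "powered"}),
    add_effects=frozenset({"running"}),
)

final_state = run([close_door, press_start], frozenset({"door_open", "powered"}))
```

Because each step declares what it needs and what it changes, an out-of-order plan fails early with a named missing precondition instead of failing silently at execution time.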
SAFETY
Hazards, failure modes, and recovery procedures extracted from manufacturer documentation. Audit-ready for regulated deployments.
PROVENANCE
Every fact traces to source. No hallucination. No invented procedures. Verifiable ground truth.
METHODOLOGY
TWO-PASS EXTRACTION
Pass 1: Structural Extraction
Procedures, controls, parts from document layout
Pass 2: Semantic Enrichment
Preconditions, constraints, failure modes, cross-references
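The two passes can be illustrated with a toy sketch. It assumes the manual is already available as one plain-text string per page, that procedure titles appear in upper case, and that steps are numbered; the cue phrases and field names are hypothetical stand-ins, not the extraction method used for the dataset.

```python
import re


def pass1_structure(pages):
    """Pass 1: recover procedure titles and numbered steps from layout cues."""
    procedures = []
    current = None
    for page_no, text in enumerate(pages, start=1):
        for raw in text.splitlines():
            line = raw.strip()
            if not line:
                continue
            numbered = re.match(r"^(\d+)\.\s+(.*)$", line)
            if numbered and current is not None:
                # Keep the page number so later stages can link back to the source.
                current["steps"].append({"text": numbered.group(2), "page": page_no})
            elif line.isupper():
                current = {"title": line.title(), "steps": []}
                procedures.append(current)
    return procedures


def pass2_enrich(procedures):
    """Pass 2: attach semantic fields to each step using simple cue phrases."""
    pre_pattern = re.compile(r"\b(?:make sure|ensure)\s+(.*?)(?:[.,;]|$)", re.IGNORECASE)
    warn_pattern = re.compile(r"\b(?:warning|caution|do not)\b", re.IGNORECASE)
    for proc in procedures:
        for step in proc["steps"]:
            found = pre_pattern.search(step["text"])
            step["precondition"] = found.group(1).strip() if found else None
            step["is_warning"] = bool(warn_pattern.search(step["text"]))
    return procedures


# Hypothetical two-page manual excerpt.
pages = [
    "DESCALING\n1. Ensure the tank is empty.\n2. Add descaling solution.\n",
    "3. Do not open the lid during the cycle.\n",
]
procedures = pass2_enrich(pass1_structure(pages))
```

Splitting the work this way keeps layout parsing independent of interpretation, so the semantic rules can change without re-parsing the document.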
QUALITY PIPELINE
1. Schema validation (M-IDL v1.3.0)
2. Primitive validation (approved ontology only)
3. Provenance verification (source page linking)
4. Composite quality scoring
5. Sample human review
Current benchmark: 0.988 quality score, zero hallucinated primitives.
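The automated stages of this pipeline could be sketched as follows. The required fields, the primitive list, and the unweighted scoring formula are all assumptions made for illustration; the actual M-IDL schema, ontology, and scoring weights are not described here.

```python
# Hypothetical stand-ins: not the real M-IDL schema or approved ontology.
REQUIRED_FIELDS = {"id", "primitive", "source_page"}
APPROVED_PRIMITIVES = {"press", "turn", "open", "close", "insert"}


def check_step(step, page_count):
    """Run the three automated checks on one extracted step."""
    schema_ok = REQUIRED_FIELDS.issubset(step)
    primitive_ok = schema_ok and step["primitive"] in APPROVED_PRIMITIVES
    page = step.get("source_page")
    # Provenance passes only if the cited page exists in the source document.
    provenance_ok = isinstance(page, int) and 1 <= page <= page_count
    return {"schema": schema_ok, "primitive": primitive_ok, "provenance": provenance_ok}


def composite_score(steps, page_count):
    """Fraction of all checks passed across all steps (assumed unweighted)."""
    if not steps:
        return 0.0
    passed = 0
    total = 0
    for step in steps:
        results = check_step(step, page_count)
        passed += sum(results.values())
        total += len(results)
    return passed / total


# Hypothetical extracted steps from a 12-page manual.
steps = [
    {"id": "s1", "primitive": "press", "source_page": 4},
    {"id": "s2", "primitive": "teleport", "source_page": 4},  # not in the ontology
    {"id": "s3", "primitive": "open", "source_page": 99},     # page does not exist
]
score = composite_score(steps, page_count=12)
```

In this sketch a step with an unapproved primitive or an unresolvable page reference lowers the score rather than being dropped, which leaves the failures visible for the human review stage.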
COLLABORATION
We are developing public benchmarks for:
- Procedure coverage metrics
- Cross-appliance transfer evaluation
- Planning efficiency comparisons
Academic collaborations welcome.
Contact: research@syngraph.io