dfdl_ref: DataFusion API reference

v1.0.0

About this Skill

dfdl_ref is a technical reference skill for the DataFusion, PyArrow, and DeltaLake APIs. It gives data analysis agents operating rules and a map of reference files, so they can look up an API instead of guessing it.

Features

Provides reference maps for Core DataFusion Python surfaces (IO, catalog, SQL, DataFrame API)
Includes guidelines for probing local environments and searching repositories for existing API usage
Covers best-in-class deployment gaps for DataFusion (caching, stats, observability, planning knobs)
Supports implementation using existing local patterns for PyArrow and UDF APIs
Features a comprehensive reference file (reference/datafusion.md) for DataFusion API details

Author: paul-heyse
Updated: 3/8/2026

Agent Capability Analysis

The dfdl_ref skill by paul-heyse is an open-source community AI agent skill for Claude Code and other IDE workflows. It helps agents execute tasks with better context, repeatability, and domain-specific guidance.

Ideal Agent Persona

Data analysis agents that need comprehensive references for the DataFusion, DeltaLake, and PyArrow APIs.

Core Value

Gives agents guidelines and references for the DataFusion, DeltaLake, and PyArrow APIs, including the core DataFusion Python surfaces and best-in-class deployment patterns, so that code written against these libraries relies on verified APIs rather than guesses.

Capabilities Granted for dfdl_ref

Debugging DataFusion and DeltaLake integrations
Optimizing PyArrow data processing workflows
Implementing efficient DataFusion and DeltaLake APIs using existing local patterns

Prerequisites & Limits

  • Requires access to local environment for version and method probing
  • Limited to DataFusion, DeltaLake, and PyArrow APIs

dfdl_ref

Install dfdl_ref, an AI agent skill that works with Claude Code, Cursor, and Windsurf, with one-command setup.

SKILL.md

Operating rule: never guess DataFusion/DeltaLake/PyArrow/UDF APIs

When uncertain:

  1. Probe local environment (versions + available methods).
  2. Search the repo for how we already use it.
  3. Open the relevant reference file below (only the section you need).
  4. Implement using existing local patterns unless the plan says otherwise.
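
Steps 1 and 2 above can be sketched in Python as follows. This is a minimal illustration, not part of the skill itself: the helper names probe and find_usages are hypothetical, and the sketch assumes the packages are installed under the distribution and module names datafusion, deltalake, and pyarrow.

```python
import importlib
import importlib.metadata
import re
from pathlib import Path


def probe(dist_name, module_name=None):
    """Report the installed version and public top-level names of a package.

    `probe` is a hypothetical helper. The version is None when the
    distribution is not installed, and the name list is empty when the
    module cannot be imported.
    """
    try:
        version = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    try:
        module = importlib.import_module(module_name or dist_name)
        names = sorted(n for n in dir(module) if not n.startswith("_"))
    except ImportError:
        names = []
    return {"version": version, "names": names}


def find_usages(repo_root, symbol, suffixes=(".py",)):
    """Yield (path, line_number, line) for each source line mentioning `symbol`.

    `find_usages` is a hypothetical helper; a plain text search such as
    grep does the same job.
    """
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    for path in sorted(Path(repo_root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if pattern.search(line):
                    yield path, number, line.strip()
```

For example, probe("datafusion")["names"] lists what the installed version actually exposes, and find_usages(".", "SessionContext") shows where the repo already calls it. Checking both before writing new code is the point of the operating rule: the local environment and the existing code base are the source of truth, not memory.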

Reference map (open these files as needed)

  • Core DataFusion Python surfaces (IO, catalog, SQL, DataFrame API): reference/datafusion.md
  • "Best-in-class deployment gaps" (caching, stats, observability, planning knobs): reference/datafusion_addendum.md
  • Planning deep dive (logical/physical plan pipeline, introspection, optimization rules): reference/datafusion_planning.md
  • Rust UDF contracts (Scalar/UDAF/UDWF/Async/named args): reference/datafusion_rust_UDFs.md
  • Schema management + schema pitfalls: reference/datafusion_schema.md
  • DeltaLake ↔ DataFusion integration details: reference/deltalake_datafusion_integration.md
  • Advanced Rust integration (PyO3 packaging, wheels, CI, native module distribution): reference/datafusion_deltalake_advanced_rust_integration.md
  • DataFusionMixins trait (Delta snapshot schema + predicate parsing helpers): reference/deltalake_datafusionmixins.md
  • Plan combination (composing DataFusion plans via joins/unions/CTEs, Delta integration, parameterized queries, plan serialization): reference/datafusion_plan_combination.md
  • Rust LogicalPlan programmatic construction (LogicalPlanBuilder, Expr, schema/DFSchema, plan rewriting via TreeNode, extensibility, serialization): reference/Datafusion_logicplan_rust.md
  • DataFusion tracing (Rust community extension: execution spans, metrics capture, partial-result previews, rule-phase instrumentation, OpenTelemetry export): reference/datafusion-tracing.md
  • DeltaLake core (format/protocol, client APIs, 3-layer model): reference/deltalake.md
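
Opening only the section you need from one of these reference files could look like the sketch below. It assumes the files use standard Markdown "#" headings, which the listing does not confirm; read_section is a hypothetical helper, not something the skill ships.

```python
import re
from pathlib import Path

# Matches ATX-style Markdown headings such as "## IO".
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def read_section(path, title):
    """Return the text under the first heading whose title contains `title`.

    The match is case-insensitive. The section runs until the next heading
    of the same or a higher level, and lines inside fenced code blocks are
    never treated as headings. Returns None when no heading matches.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    start = level = None
    in_fence = False
    for index, line in enumerate(lines):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if not match:
            continue
        depth = len(match.group(1))
        if start is None:
            if title.lower() in match.group(2).lower():
                start, level = index, depth
        elif depth <= level:
            return "\n".join(lines[start:index]).rstrip()
    if start is None:
        return None
    return "\n".join(lines[start:]).rstrip()
```

A call such as read_section("reference/datafusion.md", "catalog") would return just that section, if a heading with that word exists in the file. Reading one section at a time keeps the agent's context small, which is presumably why the operating rule asks for it.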

FAQ & Installation Steps


Frequently Asked Questions

What is dfdl_ref?

dfdl_ref is a technical reference skill for the DataFusion, PyArrow, and DeltaLake APIs. It provides operating rules and reference files for data analysis agents that work with these libraries.

How do I install dfdl_ref?

Run the command: npx killer-skills add paul-heyse/CodeAnatomy/dfdl_ref. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for dfdl_ref?

Key use cases include debugging DataFusion and DeltaLake integrations, optimizing PyArrow data processing workflows, and implementing efficient DataFusion and DeltaLake APIs using existing local patterns.

Which IDEs are compatible with dfdl_ref?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for dfdl_ref?

Requires access to local environment for version and method probing. Limited to DataFusion, DeltaLake, and PyArrow APIs.

How To Install

  1. Open your terminal

    Open the terminal or command line in your project directory.

  2. Run the install command

    Run: npx killer-skills add paul-heyse/CodeAnatomy/dfdl_ref. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill

    The skill is now active. Your AI agent can use dfdl_ref immediately in the current project.
