@keugenek
Contributor

Summary

  • Add new MCP tool databricks_query_sdk_docs that allows LLM agents to search Databricks SDK documentation
  • Addresses the problem where LLMs struggle with SDK APIs and guess incorrectly (e.g., using requests.get(...) instead of proper SDK methods)
  • Generated documentation index from annotations_openapi.yml with 7 core services, 277 types, and 3 enums

Features

  • Fuzzy/keyword search across services, methods, types, and enums
  • Filtering by category (services, methods, types, enums) and service name
  • Score-based ranking for relevant results
  • LLM-friendly output with method signatures, parameters, return types, and examples
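The filtering and ranking described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the `Entry` and `Result` types, the score weights, and the sample data are all hypothetical, and plain substring matching stands in for whatever fuzzy matching the tool actually uses.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Entry is a hypothetical index record; the real index schema is not shown in this PR.
type Entry struct {
	Category string // "services", "methods", "types", or "enums"
	Service  string
	Name     string
	Doc      string
}

// Result pairs an entry with its relevance score.
type Result struct {
	Entry Entry
	Score int
}

// score counts query-term hits, weighting name matches above doc matches.
// The weights (10 and 1) are illustrative assumptions.
func score(e Entry, terms []string) int {
	name := strings.ToLower(e.Name)
	doc := strings.ToLower(e.Doc)
	total := 0
	for _, t := range terms {
		if strings.Contains(name, t) {
			total += 10
		}
		if strings.Contains(doc, t) {
			total++
		}
	}
	return total
}

// Search filters by category and service (an empty string means "any"),
// drops entries that score zero, and sorts the rest by descending score.
func Search(index []Entry, query, category, service string) []Result {
	terms := strings.Fields(strings.ToLower(query))
	var out []Result
	for _, e := range index {
		if category != "" && e.Category != category {
			continue
		}
		if service != "" && !strings.EqualFold(e.Service, service) {
			continue
		}
		if s := score(e, terms); s > 0 {
			out = append(out, Result{Entry: e, Score: s})
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

// sampleIndex returns made-up entries for demonstration only.
func sampleIndex() []Entry {
	return []Entry{
		{Category: "methods", Service: "jobs", Name: "Create", Doc: "Create a new job."},
		{Category: "methods", Service: "jobs", Name: "List", Doc: "List jobs."},
		{Category: "methods", Service: "clusters", Name: "Create", Doc: "Create a new cluster."},
		{Category: "types", Service: "jobs", Name: "CreateJob", Doc: "Request to create a job."},
	}
}

func main() {
	for _, r := range Search(sampleIndex(), "create job", "methods", "") {
		fmt.Printf("%d %s.%s\n", r.Score, r.Entry.Service, r.Entry.Name)
	}
}
```

Weighting name hits above description hits is one simple way to keep an exact method match at the top of the list; the actual scoring in the PR may differ.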

Example Usage

Query: "how to create a job"
Response: Jobs.Create method with signature, parameters, return type, and example code
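One way such a response could be assembled is sketched below. The `MethodDoc` and `Param` types, the output layout, and the sample values (including the placeholder signature) are assumptions made for illustration; the PR does not show the tool's actual output format.

```go
package main

import (
	"fmt"
	"strings"
)

// Param and MethodDoc are hypothetical shapes for one documented method.
type Param struct {
	Name, Type, Doc string
}

type MethodDoc struct {
	Service, Name, Signature, Returns, Example string
	Params                                     []Param
}

// RenderMarkdown formats a method as compact markdown for an LLM to read.
func RenderMarkdown(m MethodDoc) string {
	var b strings.Builder
	fmt.Fprintf(&b, "## %s.%s\n\n", m.Service, m.Name)
	fmt.Fprintf(&b, "Signature: `%s`\n\n", m.Signature)
	if len(m.Params) > 0 {
		b.WriteString("Parameters:\n")
		for _, p := range m.Params {
			fmt.Fprintf(&b, "- `%s` (%s): %s\n", p.Name, p.Type, p.Doc)
		}
		b.WriteString("\n")
	}
	fmt.Fprintf(&b, "Returns: `%s`\n", m.Returns)
	if m.Example != "" {
		b.WriteString("\nExample:\n\n" + m.Example + "\n")
	}
	return b.String()
}

// sampleMethod returns placeholder values; they are not copied from the SDK.
func sampleMethod() MethodDoc {
	return MethodDoc{
		Service:   "Jobs",
		Name:      "Create",
		Signature: "Create(ctx, request) (response, error)",
		Returns:   "response",
		Params: []Param{
			{Name: "request", Type: "request struct", Doc: "Settings for the new job."},
		},
	}
}

func main() {
	fmt.Print(RenderMarkdown(sampleMethod()))
}
```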

Test plan

  • Unit tests pass: go test ./experimental/aitools/lib/providers/sdkdocs/...
  • Linter passes: make lint
  • CLI builds successfully: go build -o cli .
  • Manual testing with MCP client

Related

Discussion in #db-agent-builder about APX MCP providing SDK documentation indexing for better LLM interactions.

🤖 Generated with Claude Code

Add a new MCP tool `databricks_query_sdk_docs` that allows LLM agents
to search Databricks SDK documentation for methods, types, and examples.

This addresses the problem where LLMs struggle with the Databricks SDK
because they lack indexed documentation. Instead of guessing API calls,
agents can now query for proper method signatures, parameters, and usage.

Features:
- Fuzzy/keyword search across services, methods, types, and enums
- Category and service filtering
- Score-based result ranking
- LLM-friendly markdown output with signatures and examples

Implementation:
- New sdkdocs provider with embedded JSON documentation index
- Index generator tool that parses annotations_openapi.yml
- Generated index includes 7 core services, 277 types, and 3 enums
- Full unit test coverage for search and index loading
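
Loading an embedded JSON index might look roughly like this. The schema below (`Index`, `Service`, `Method`, `TypeDoc`) is a guess made for illustration, and a string constant stands in for the embedded file so that the sketch runs on its own; in a real package the bytes would more likely come from a `//go:embed` directive.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexJSON stands in for the embedded index file. Its shape is hypothetical.
const indexJSON = `{
  "services": {
    "jobs": {
      "methods": {
        "Create": {"signature": "Create(ctx, request)", "description": "Create a new job."}
      }
    }
  },
  "types": {
    "CreateJob": {"description": "Request to create a job."}
  },
  "enums": {
    "RunLifeCycleState": ["PENDING", "RUNNING"]
  }
}`

type Method struct {
	Signature   string `json:"signature"`
	Description string `json:"description"`
}

type Service struct {
	Methods map[string]Method `json:"methods"`
}

type TypeDoc struct {
	Description string `json:"description"`
}

type Index struct {
	Services map[string]Service  `json:"services"`
	Types    map[string]TypeDoc  `json:"types"`
	Enums    map[string][]string `json:"enums"`
}

// LoadIndex decodes the index bytes and wraps any parse error with context.
func LoadIndex(data []byte) (*Index, error) {
	var idx Index
	if err := json.Unmarshal(data, &idx); err != nil {
		return nil, fmt.Errorf("parse sdk docs index: %w", err)
	}
	return &idx, nil
}

func main() {
	idx, err := LoadIndex([]byte(indexJSON))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d services, %d types, %d enums\n", len(idx.Services), len(idx.Types), len(idx.Enums))
}
```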

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@eng-dev-ecosystem-bot
Collaborator

eng-dev-ecosystem-bot commented Jan 12, 2026

Commit: 7a2f2ac

Run: 21174052176

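| Env | 🟨 KNOWN | 💚 RECOVERED | 🙈 SKIP | ✅ pass | 🙈 skip | Time |
| --- | --- | --- | --- | --- | --- | --- |
| 🟨 aws linux | 16 | 1 | 3 | 411 | 695 | 159:36 |
| 🟨 aws-ucws linux | 15 | 5 | 2 | 583 | 569 | 195:33 |
| 🟨 aws-ucws windows | 15 | 5 | 2 | 585 | 567 | 200:42 |
| 🟨 azure linux | 13 | 1 | 4 | 411 | 694 | 205:14 |
| 🟨 azure windows | 13 | 1 | 4 | 413 | 692 | 202:20 |
| 🟨 azure-ucws linux | 16 | 1 | 3 | 579 | 568 | 234:40 |
| 🟨 gcp linux | 13 | 1 | 4 | 400 | 700 | 202:37 |
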
24 interesting tests: 22 KNOWN, 1 SKIP, 1 RECOVERED
Test Name aws linux aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux gcp linux
🟨​ TestAccept 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/deployment/bind/alert 🙈​S 🙈​S 🙈​S 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/deployment/bind/alert/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/deployment/bind/alert/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/generate/alert 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/generate/alert/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/generate/alert/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/alerts/basic 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/alerts/basic/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/alerts/basic/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/alerts/with_file 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 🟨​K 🟨​K 🟨​K 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/synced_database_tables/basic 🙈​S 🟨​K 🟨​K 🙈​S 🙈​S 🟨​K 🙈​S
🟨​ TestAccept/bundle/resources/synced_database_tables/basic/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/synced_database_tables/basic/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 🟨​K
💚​ TestAccept/ssh/connection 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
Top 28 slowest tests (at least 2 minutes):
duration env testname
7:09 azure windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
6:40 azure linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
6:29 azure linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:41 gcp linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
5:31 aws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:11 gcp linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:07 azure-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
4:49 azure-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
4:22 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:17 aws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
3:14 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:11 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:10 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:04 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:52 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:42 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:41 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:38 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:28 aws-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
2:26 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:25 aws-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
2:24 aws-ucws windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
2:15 aws-ucws windows TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=direct/UV_PYTHON=3.12
2:15 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:11 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:09 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:09 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:04 azure-ucws linux TestAccept/bundle/resources/experiments/basic/DATABRICKS_BUNDLE_ENGINE=terraform

Adds a skill that helps Claude Code users discover and use
the databricks_query_sdk_docs MCP tool effectively when
asking about SDK methods, types, and parameters.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@keugenek changed the title from "aitools: Add SDK documentation query tool for MCP server" to "[PoC] aitools: Add SDK documentation query tool for MCP server" on Jan 12, 2026
@fjakobs
Contributor

fjakobs commented Jan 13, 2026

@keugenek does it have to be an MCP tool? I'd prefer this to be a command under `databricks experimental aitools tools`.

This commit enhances the SDK documentation generator to parse the
actual Go SDK source code instead of using hardcoded service definitions.

Changes:
- Rewrote tools/gen_sdk_docs_index.go to parse SDK using go/ast
- Extracts service interfaces, method signatures, and descriptions
- Parses struct types and enums automatically
- Added tools/verify_sdk_docs_index.py for CI staleness check
- Added Makefile targets: sdk-docs-index and verify-sdk-docs-index
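
Extracting service methods with `go/ast` could be sketched as below. This is only an illustration of the approach: the `MethodInfo` type, the `ExtractMethods` helper, and the sample source string are hypothetical, and the real generator in `tools/gen_sdk_docs_index.go` presumably handles far more (signatures, struct fields, enums).

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// src is a made-up stand-in for one SDK source file.
const src = `package jobs

// JobsService manages jobs.
type JobsService interface {
	// Create a new job.
	Create(ctx context.Context, request CreateJob) (*CreateResponse, error)

	// List jobs.
	List(ctx context.Context, request ListJobsRequest) (*ListJobsResponse, error)
}

type CreateJob struct {
	Name string
}
`

// MethodInfo is a hypothetical record for one interface method.
type MethodInfo struct {
	Service, Name, Doc string
}

// ExtractMethods parses Go source and collects the methods of every
// interface type along with their doc comments.
func ExtractMethods(filename, source string) ([]MethodInfo, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, source, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	var out []MethodInfo
	ast.Inspect(file, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok {
			return true
		}
		iface, ok := ts.Type.(*ast.InterfaceType)
		if !ok {
			return true
		}
		for _, f := range iface.Methods.List {
			// Skip embedded interfaces, which have no names or are not functions.
			if _, isFunc := f.Type.(*ast.FuncType); !isFunc || len(f.Names) == 0 {
				continue
			}
			out = append(out, MethodInfo{
				Service: ts.Name.Name,
				Name:    f.Names[0].Name,
				Doc:     strings.TrimSpace(f.Doc.Text()),
			})
		}
		return false
	})
	return out, nil
}

func main() {
	methods, err := ExtractMethods("jobs.go", src)
	if err != nil {
		panic(err)
	}
	for _, m := range methods {
		fmt.Printf("%s.%s: %s\n", m.Service, m.Name, m.Doc)
	}
}
```

Because `go/parser` only checks syntax, the sample file does not need its imports to resolve, which is what makes this kind of source scan cheap to run in a generator.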

Results:
- Previous: 7 services, 277 types, 3 enums (mostly hardcoded)
- Now: 11 services, 1302 types, 263 enums (auto-generated)

Usage:
- make sdk-docs-index          # Regenerate index
- make verify-sdk-docs-index   # Check if index is up to date
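
The staleness check itself lives in `tools/verify_sdk_docs_index.py` and is not shown here. As a rough sketch of the idea, assuming it boils down to comparing the committed index with a freshly generated one, the comparison could look like this (the `IsStale` and `canonical` helpers are hypothetical):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// canonical re-encodes JSON so that whitespace and key order
// differences do not count as drift.
func canonical(data []byte) ([]byte, error) {
	var v any
	if err := json.Unmarshal(data, &v); err != nil {
		return nil, err
	}
	return json.Marshal(v) // map keys are written in sorted order
}

// IsStale reports whether the committed index differs from a regenerated one.
func IsStale(committed, regenerated []byte) (bool, error) {
	a, err := canonical(committed)
	if err != nil {
		return false, fmt.Errorf("committed index: %w", err)
	}
	b, err := canonical(regenerated)
	if err != nil {
		return false, fmt.Errorf("regenerated index: %w", err)
	}
	return !bytes.Equal(a, b), nil
}

func main() {
	stale, err := IsStale([]byte(`{"services": 11}`), []byte(`{"services": 12}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("stale:", stale)
}
```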

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This commit adds GitHub Actions workflows to automatically manage
the SDK documentation index:

1. check.yml: Added verification step to fail PRs with stale index
   - Runs `make verify-sdk-docs-index` on every PR

2. update-sdk-docs.yml: New workflow for automatic updates
   - Triggers on: manual dispatch, daily schedule, go.mod changes
   - Auto-commits to main when SDK version changes via push
   - Creates PR for scheduled/manual triggers if changes detected
   - Includes SDK version in commit messages

This ensures the SDK docs index stays up to date when:
- Dependabot bumps the SDK version
- Manual SDK updates are made
- Daily scheduled checks detect drift

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>