Skills have become one of the most used extension points in Claude Code. They're flexible, easy to make, and simple to distribute.
But this flexibility also makes it hard to know what works best. What types of skills are worth making? What's the secret to writing a good skill? When should you share them with others?
We've been using skills extensively in Claude Code at Anthropic, with hundreds of them in active use. These are the lessons we've learned about using skills to accelerate our development.
What are Skills?
If you're new to skills, I'd recommend reading our docs or watching our newest Agent Skills course on Skilljar; this post assumes you already have some familiarity with skills.
A common misconception we hear about skills is that they are "just markdown files", but the most interesting part of skills is that they're not just text files. They're folders that can include scripts, assets, data, and more, which the agent can discover, explore, and manipulate.
In Claude Code, skills also have a wide variety of configuration options, including registering dynamic hooks.
We've found that some of the most interesting skills in Claude Code use these configuration options and folder structure creatively.
Types of Skills
After cataloging all of our skills, we noticed they cluster into a few recurring categories. The best skills fit cleanly into one; the more confusing ones straddle several. This isn't a definitive list, but it is a good way to think about whether any skills are missing inside your org.
1. Library & API Reference
Skills that explain how to correctly use a library, CLI, or SDK. These can cover internal libraries as well as common libraries that Claude Code sometimes has trouble with. These skills often include a folder of reference code snippets and a list of gotchas for Claude to avoid when writing a script.
Examples:
billing-lib — your internal billing library: edge cases, footguns, etc.
internal-platform-cli — every subcommand of your internal CLI wrapper, with examples of when to use them
frontend-design — make Claude better at your design system
2. Product Verification
Skills that describe how to test or verify that your code is working. These are often paired with an external tool like playwright or tmux for doing the verification.
Verification skills are extremely useful for ensuring Claude's output is correct. It can be worth having an engineer spend a week just making your verification skills excellent.
Consider techniques like having Claude record a video of its output so you can see exactly what it tested, or enforcing programmatic assertions on state at each step. These are often done by including a variety of scripts in the skill.
Examples:
signup-flow-driver — runs through signup → email verify → onboarding in a headless browser, with hooks for asserting state at each step
checkout-verifier — drives the checkout UI with Stripe test cards, verifies the invoice actually lands in the right state
tmux-cli-driver — for interactive CLI testing where the thing you're verifying needs a TTY
3. Data Fetching & Analysis
Skills that connect to your data and monitoring stacks. These skills might include libraries to fetch your data with credentials, specific dashboard IDs, and so on, as well as instructions on common workflows or ways to get data.
Examples:
funnel-query — "which events do I join to see signup → activation → paid", plus the table that actually has the canonical user_id
cohort-compare — compare two cohorts' retention or conversion, flag statistically significant deltas, link to the segment definitions
grafana — datasource UIDs, cluster names, a problem → dashboard lookup table
4. Business Process & Team Automation
Skills that automate repetitive workflows into one command. These skills are usually fairly simple instructions, but might have more complicated dependencies on other skills or MCPs. For these skills, saving previous results in log files can help the model stay consistent and reflect on previous executions of the workflow.
Examples:
standup-post — aggregates your ticket tracker, GitHub activity, and prior Slack into a formatted standup, delta-only
create-<ticket-system>-ticket — enforces schema (valid enum values, required fields) plus a post-creation workflow (ping the reviewer, link in Slack)
weekly-recap — merged PRs + closed tickets + deploys → a formatted recap post
5. Code Scaffolding & Templates
Skills that generate framework boilerplate for a specific function in your codebase. You might combine these skills with scripts that can be composed. They are especially useful when your scaffolding has natural-language requirements that can't be covered purely by code.
Examples:
new-<framework>-workflow — scaffolds a new service/workflow/handler with your annotations
new-migration — your migration file template plus common gotchas
create-app — a new internal app with your auth, logging, and deploy config pre-wired
6. Code Quality & Review
Skills that enforce code quality inside your org and help review code. These can include deterministic scripts or tools for maximum robustness. You may want to run these skills automatically as part of hooks or inside a GitHub Action.
adversarial-review — spawns a fresh-eyes subagent to critique, implements fixes, and iterates until the findings degrade to nitpicks
code-style — enforces code style, especially styles that Claude does not do well by default
testing-practices — instructions on how to write tests and what to test
7. CI/CD & Deployment
Skills that help you fetch, push, and deploy code inside your codebase. These skills may reference other skills to collect data.
Examples:
babysit-pr — monitors a PR → retries flaky CI → resolves merge conflicts → enables auto-merge
deploy-<service> — build → smoke test → gradual traffic rollout with error-rate comparison → auto-rollback on regression
cherry-pick-prod — isolated worktree → cherry-pick → conflict resolution → PR with a template
8. Runbooks
Skills that take a symptom (such as a Slack thread, alert, or error signature), walk through a multi-tool investigation, and produce a structured report.
Examples:
<service>-debugging — maps symptoms → tools → query patterns for your highest-traffic services
oncall-runner — fetches the alert → checks the usual suspects → formats a finding
log-correlator — given a request ID, pulls matching logs from every system that might have touched it
9. Infrastructure Operations
Skills that perform routine maintenance and operational procedures, some of which involve destructive actions that benefit from guardrails. These make it easier for engineers to follow best practices in critical operations.
Examples:
<resource>-orphans — finds orphaned pods/volumes → posts to Slack → soak period → user confirms → cascading cleanup
dependency-management — your org's dependency approval workflow
cost-investigation — "why did our storage/egress bill spike?", with the specific buckets and query patterns
Tips for Making Skills
Once you've decided on the skill to make, how do you write it? These are some of the best practices, tips, and tricks we've found.
We also recently released Skill Creator to make it easier to create skills in Claude Code.
Don't State the Obvious
Claude Code knows a lot about your codebase, and Claude knows a lot about coding, including many default opinions. If you're publishing a skill that is primarily about knowledge, try to focus on information that pushes Claude out of its normal way of thinking.
The frontend-design skill is a great example: it was built by one of the engineers at Anthropic by iterating with customers on improving Claude's design taste, avoiding classic patterns like the Inter font and purple gradients.
Build a Gotchas Section
The highest-signal content in any skill is the Gotchas section. These sections should be built up from the common failure points that Claude runs into when using your skill. Ideally, you will update your skill over time to capture these gotchas.
Use the File System & Progressive Disclosure
As we said earlier, a skill is a folder, not just a markdown file. You should think of the entire file system as a form of context engineering and progressive disclosure. Tell Claude what files are in your skill, and it will read them at appropriate times.
The simplest form of progressive disclosure is to point Claude at other markdown files it can use. For example, you might split detailed function signatures and usage examples into references/api.md.
Another example: if your end output is a markdown file, you might include a template for it in assets/ to copy and use.
You can have folders of references, scripts, examples, and more, which help Claude work more effectively.
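As an illustration, a skill folder using progressive disclosure might be laid out like this (the skill name and file names are hypothetical):

```text
my-billing-skill/
├── SKILL.md               # entry point: overview, when to use, pointers to the rest
├── references/
│   └── api.md             # detailed function signatures and usage examples
├── scripts/
│   └── check_invoice.py   # deterministic verification helper
└── assets/
    └── report-template.md # template to copy for the final output
```

SKILL.md stays short and tells Claude which of the other files to read for which task, so the details only enter context when they're needed.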
Avoid Railroading Claude
Claude will generally try to stick to your instructions, and because skills are so reusable, you'll want to be careful not to be too specific in your instructions. Give Claude the information it needs, but give it the flexibility to adapt to the situation. For example:
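One hypothetical illustration of the difference between an over-specific instruction and a flexible one (both lines are invented for this sketch):

```markdown
<!-- Too specific: railroads Claude into one exact procedure -->
Run `npm test -- --suite billing`, then paste the full output into report.md.

<!-- Flexible: states the goal and constraints, lets Claude adapt -->
Verify the billing changes with the project's test suite, and summarize any
failures in a short report before proceeding.
```

The first breaks as soon as the repo's test runner or suite name differs; the second survives reuse across situations.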
Think Through the Setup
Some skills may need to be set up with context from the user. For example, if you are making a skill that posts your standup to Slack, you may want Claude to ask which Slack channel to post it in.
A good pattern is to store this setup information in a config.json file in the skill directory, as in the example above. If the config is not set up, the agent can then ask the user for the information.
If you want the agent to present structured, multiple-choice questions, you can instruct Claude to use the AskUserQuestion tool.
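A minimal sketch of what such a config.json could hold for the standup-to-Slack example; the field names here are hypothetical:

```json
{
  "slack_channel": "#team-standup",
  "timezone": "America/Los_Angeles",
  "include_github_activity": true
}
```

The skill's instructions would tell Claude to read this file first, and to ask the user (and write the answers back) if any field is missing.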
The Description Field Is for the Model
When Claude Code starts a session, it builds a listing of every available skill with its description. This listing is what Claude scans to decide "is there a skill for this request?" Which means the description field is not a summary: it's a description of when to trigger this skill.
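As a sketch, here is a SKILL.md frontmatter whose description is written as a trigger condition rather than a summary; the skill name and wording are hypothetical:

```yaml
---
name: billing-lib
# Weak (a summary): "Documentation for our internal billing library."
# Better (a trigger): tells the model when to reach for this skill.
description: >
  Use when writing, reviewing, or debugging code that touches billing,
  invoices, or subscription state; covers billing-lib edge cases and footguns.
---
```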
Memory & Storing Data
Some skills can include a form of memory by storing data. You could store data in anything from a simple append-only text log or JSON file to a full SQLite database.
For example, a standup-post skill might keep a standups.log with every post it's written, which means the next time you run it, Claude reads its own history and can tell what's changed since yesterday.
Data stored in the skill directory may be deleted when you upgrade the skill, so you should store it in a stable folder. As of today, we provide `${CLAUDE_PLUGIN_DATA}` as a stable per-plugin folder to store data in.
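A minimal sketch of that append-only memory pattern, assuming a JSON-lines log file; the file name and entry shape are illustrative:

```python
import datetime
import json
import os


def log_standup(path, summary):
    """Append today's standup entry as one JSON line."""
    entry = {
        "date": datetime.date.today().isoformat(),
        "summary": summary,
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")


def read_history(path, last_n=5):
    """Read the most recent entries so the next run can diff against them."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        entries = [json.loads(line) for line in f if line.strip()]
    return entries[-last_n:]
```

On each run, the skill would call read_history first, summarize only what changed, then record the new post with log_standup.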
Store Scripts & Generate Code
One of the most powerful tools you can give Claude is code. Giving Claude scripts and libraries lets Claude spend its turns on composition, deciding what to do next rather than reconstructing boilerplate.
For example, in your data science skill you might have a library of functions to fetch data from your event source. To let Claude do complex analysis, you could give it a set of helper functions like so:
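A hedged sketch of what such a helper library might look like; the function names and event schema are invented for illustration, and the fetch is stubbed with static data rather than a real event store:

```python
from datetime import date


def fetch_events(day, event_name=None):
    """Fetch raw events for a given day.

    Stubbed with static data here; a real skill would query your event source.
    """
    sample = [
        {"event": "signup", "user_id": 1, "day": date(2024, 6, 4)},
        {"event": "activation", "user_id": 1, "day": date(2024, 6, 4)},
        {"event": "signup", "user_id": 2, "day": date(2024, 6, 4)},
    ]
    return [
        e for e in sample
        if e["day"] == day and (event_name is None or e["event"] == event_name)
    ]


def count_by_event(events):
    """Aggregate a list of events into {event_name: count}."""
    counts = {}
    for e in events:
        counts[e["event"]] = counts.get(e["event"], 0) + 1
    return counts
```

With primitives like these shipped in the skill's scripts/ folder, Claude can write a five-line script to answer a new question instead of re-deriving the query logic each time.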
Claude can then generate scripts on the fly that compose this functionality into more advanced analysis for prompts like "What happened on Tuesday?"
On-Demand Hooks
Skills can include hooks that are only activated when the skill is called, and that last for the duration of the session. Use this for more opinionated hooks that you don't want to run all the time but that are extremely useful sometimes.
For example:
/careful — blocks rm -rf, DROP TABLE, force-push, and kubectl delete via a PreToolUse matcher on Bash. You only want this when you know you're touching prod; having it always on would drive you insane
/freeze — blocks any Edit/Write that's not in a specific directory. Useful when debugging: "I want to add logs but I keep accidentally 'fixing' unrelated code"
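As a sketch, the core of a /careful-style PreToolUse hook script might look like the following. It assumes Claude Code's hook contract of receiving the tool call as JSON on stdin and blocking via exit code 2; check the hooks documentation for the exact payload schema before relying on it:

```python
#!/usr/bin/env python3
"""Sketch of a PreToolUse hook that blocks destructive Bash commands."""
import json
import re
import sys

# Patterns for commands we never want run without a human in the loop.
DANGEROUS_PATTERNS = [
    r"\brm\s+-rf\b",
    r"\bdrop\s+table\b",
    r"\bgit\s+push\b.*(--force|-f)\b",
    r"\bkubectl\s+delete\b",
]


def is_dangerous(command: str) -> bool:
    """Return True if the command matches any destructive pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DANGEROUS_PATTERNS)


if __name__ == "__main__" and not sys.stdin.isatty():
    raw = sys.stdin.read()
    if raw.strip():
        payload = json.loads(raw)
        command = payload.get("tool_input", {}).get("command", "")
        if is_dangerous(command):
            # Exit code 2 blocks the tool call; stderr is shown to Claude.
            print(f"Blocked potentially destructive command: {command}",
                  file=sys.stderr)
            sys.exit(2)
```

Because the hook ships inside the skill, the pattern list only costs you anything during the sessions where /careful is actually invoked.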
Distributing Skills
One of the biggest benefits of skills is that you can share them with the rest of your team.
There are two ways you might share skills with others:
Check your skills into your repo (under ./.claude/skills)
Make a plugin and have a Claude Code plugin marketplace where users can upload and install plugins (read more in the documentation)
For smaller teams working across relatively few repos, checking your skills into repos works well. But every skill that is checked in also adds a little bit to the model's context. As you scale, an internal plugin marketplace lets you distribute skills and lets your team decide which ones to install.
Managing a Marketplace
How do you decide which skills go in a marketplace? How do people submit them?
We don't have a centralized team that decides; instead, we try to find the most useful skills organically. If you have a skill that you want people to try out, you can upload it to a sandbox folder in GitHub and point people to it in Slack or other forums.
Once a skill has gotten traction (which is up to the skill owner to decide), they can put in a PR to move it into the marketplace.
A word of warning: it can be quite easy to create bad or redundant skills, so making sure you have some method of curation before release is important.
Composing Skills
You may want to have skills that depend on each other. For example, you may have a file-upload skill that uploads a file, and a CSV-generation skill that makes a CSV and uploads it. This sort of dependency management is not natively built into marketplaces or skills yet, but you can simply reference other skills by name, and the model will invoke them if they are installed.
Measuring Skills
To understand how a skill is doing, we use a PreToolUse hook that lets us log skill usage within the company (example code here). This means we can find skills that are popular, or that are undertriggering compared to our expectations.
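A hedged sketch of what the body of such a logging hook could look like. It assumes the hook payload carries `tool_name`/`tool_input` fields as in Claude Code's hook input, and that skill invocations show up under a "Skill" tool with a "command" field; those payload details, like the log path, are assumptions rather than the exact code we use:

```python
import datetime
import json


def record_skill_use(payload, log_path):
    """Append a JSONL record when a hook payload looks like a skill call.

    `payload` is the parsed hook input; the "Skill" tool name and the
    "command" field are assumptions about the payload shape.
    """
    if payload.get("tool_name") != "Skill":
        return None
    record = {
        "skill": payload.get("tool_input", {}).get("command", "unknown"),
        "ts": datetime.datetime.now().isoformat(timespec="seconds"),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Aggregating the resulting log (for instance, counting records per skill) is enough to spot skills that are popular or that never fire at all.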
Conclusion
Skills are incredibly powerful, flexible tools for agents, but it's still early and we're all figuring out how to use them best.
Think of this more as a grab bag of useful tips that we've seen work than a definitive guide. The best way to understand skills is to get started, experiment, and see what works for you. Most of ours began as a few lines and a single gotcha, and got better because people kept adding to them as Claude hit new edge cases.
I hope this was helpful; let me know if you have any questions.