pintudeyudi

Huawei Cloud Cce Change Impact Analyzer

Author: shijingcheng · GitHub ↗ · v0.1.0 · MIT-0
cross-platform ⚠ suspicious
24 total downloads · 0 favorites · 0 current installs · 1 version
Install in OpenClaw
/install huawei-cloud-cce-change-impact-analyzer
Description
Huawei Cloud CCE change impact analysis skill that converts "what changed before the incident" into provable causal attribution. Use this skill when a CCE in...
Usage Instructions (SKILL.md)

CCE Change Impact Analyzer

⚠️ Execution Method (Must Read): This skill runs its diagnosis through local Python scripts via the scripts/huawei-cloud.py dispatcher. Using hcloud, kubectl, other CLI tools, or direct API calls is prohibited.

  • All actions are dispatched through scripts/huawei-cloud.py with --action <action_name> and --params <json_params>
  • All scripts, including the environment check scripts, are inside the skill package. You must use skill action=exec to execute them; do not run them directly in a shell
  • For action names and parameters, see the Core Tools section below
  • Do not attempt hcloud, kubectl, curl IAM, or other CLI/API methods. This skill does not depend on these tools
  • All paths are relative to the skill directory, which is the directory where this SKILL.md resides
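
As a rough illustration of the dispatch convention above, the following Python sketch assembles the argument list for a single dispatcher call. The helper name `build_dispatch_argv`, the `python` interpreter name, and the `<cluster_id>` placeholder are assumptions made for illustration; only the script path and the `--action` / `--params` flags come from this document, and actual execution must still go through skill action=exec.

```python
import json


def build_dispatch_argv(action, params):
    """Assemble dispatcher arguments for one action (hypothetical helper)."""
    # --params carries a single JSON string; json.dumps handles quoting and escaping.
    return [
        "python",  # assumed interpreter name
        "scripts/huawei-cloud.py",
        "--action", action,
        "--params", json.dumps(params, ensure_ascii=False),
    ]


argv = build_dispatch_argv(
    "huawei_change_impact_analyze",
    {"region": "cn-north-4", "cluster_id": "<cluster_id>", "hours": 1},
)
print(argv)
```

Serializing the parameters with `json.dumps` rather than hand-writing the JSON string avoids quoting mistakes, which matters most on Windows shells.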

Overview

This skill turns "what changed before the incident" into provable causal attribution. It ingests audit logs, K8s historical events, AOM active and history alarms, and current resource topology snapshots, filters noise, and maps core changes to their blast radius. It then scores risk by sensitivity, topology scope, security boundary span, temporal proximity to the fault, and event/alarm correlation, and outputs a complete Markdown report containing investigation steps, a core change timeline, an evidence matrix, blast radius, Top N risk alerts, a conclusion, and data gaps.

This skill is applicable to the following scenarios:

  1. Incidents where recent workload releases, config updates, or network/security policy changes may be the cause
  2. CoreDNS, kube-proxy, or cluster plugin configuration changes causing business-wide failures
  3. Node taint, cordon/drain, node pool resize, or cluster upgrade triggering Pod Pending, Evicted, or NotReady
  4. NetworkPolicy/RBAC changes causing connection timeouts, 403 errors, DNS anomalies, or cross-namespace access failures
  5. Service/Ingress/Gateway route changes causing traffic routing failures
  6. Correlating audit trail changes with observed failures and alarm timelines

This skill does NOT handle the following:

  1. Executing any remediation actions (rollback, scale, delete, drain, reboot, modify NetworkPolicy/RBAC/Security Group/VPC ACL)
  2. Making causal conclusions from object updates alone without temporal or response-signal correlation
  3. Creating, modifying, or deleting CCE resources
  4. Guessing or fabricating diagnosis results without evidence

---
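
The five scoring dimensions named in the Overview (sensitivity, topology scope, security boundary span, temporal proximity, and event/alarm correlation) can be pictured as a weighted sum. The sketch below is illustrative only: the weights, the 0 to 1 input scale, the level thresholds, and every level name other than "Critical" are assumptions. The rules the skill actually applies are documented in references/workflow.md.

```python
# Hypothetical weights; the skill's real scoring rules live in references/workflow.md.
WEIGHTS = {
    "sensitivity": 0.30,
    "topology_scope": 0.25,
    "security_boundary_span": 0.15,
    "temporal_proximity": 0.20,
    "signal_correlation": 0.10,
}


def risk_score(factors):
    """Combine per-dimension values in [0, 1] into a 0-100 score."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(factors.get(name, 0.0), 0.0), 1.0)  # clamp to [0, 1]
        total += weight * value
    return round(total * 100)


def risk_level(score):
    """Map a score to a label (thresholds are assumptions)."""
    if score >= 90:
        return "Critical"
    if score >= 70:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"


# Example: a CoreDNS ConfigMap patch shortly before a cluster-wide DNS fault.
coredns_patch = {
    "sensitivity": 1.0,
    "topology_scope": 1.0,
    "security_boundary_span": 0.8,
    "temporal_proximity": 1.0,
    "signal_correlation": 0.9,
}
score = risk_score(coredns_patch)
print(score, risk_level(score))
```

A missing dimension contributes zero in this sketch, so a change with no alarm or event correlation cannot reach the top level on sensitivity alone.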

Prerequisites

Before using this skill, you must run the environment check script, which validates the environment and installs dependencies in one step:

  • Linux / macOS: skill action=exec: bash skill://scripts/check_env.sh
  • Windows: skill action=exec: powershell -ExecutionPolicy Bypass -File skill://scripts/check_env.ps1

Windows Note: Do not use && to chain commands (PowerShell 5.x does not support it). Use semicolons if you need to change directories first.

The script checks the following in sequence: Python >= 3.6 → install dependencies → validate SDK → validate credentials → validate service availability.
If the environment check fails, fix the issues before continuing with other actions.

Environment Variables:

| Variable | Required | Description |
|----------|----------|-------------|
| HW_ACCESS_KEY | Yes | Huawei Cloud AK |
| HW_SECRET_KEY | Yes | Huawei Cloud SK |
| HW_REGION_NAME | No | Default cn-north-4 |
| HW_PROJECT_ID | No | Project ID (automatically obtained via IAM API when not set) |
| HW_SECURITY_TOKEN | No | Required when using temporary AK/SK |
| HW_CLUSTER_ID | No | Default CCE cluster ID (can also be passed per action) |

Security Constraints:

  1. Never persist credentials (AK/SK/Token/Certificate) to the filesystem
  2. AK/SK exist only within the current request call stack and are released after use
  3. Only non-sensitive project IDs are cached in process memory (never written to disk)
  4. All temporary certificate files must be deleted immediately after use
  5. Never expose AK/SK in logs, responses, or error messages

Do not output the values of the above environment variables.

---

IAM Permission Requirements

| API Action | Permission | Purpose |
|-----------|------------|---------|
| cce:cluster:get | Get cluster | View cluster details |
| cce:cluster:list | List clusters | List CCE clusters |
| cce:node:list | List nodes | List cluster nodes |
| cce:nodepool:list | List node pools | List node pools |
| aom:*:get | Read AOM | Query AOM metrics and alarms |
| aom:alarmRule:list | List alarm rules | Query alarm rules |
| aom:event:list | List events | Query AOM alarm events |

Permission Failure Handling:

  1. When any command fails due to a permission error, display the list of required permissions
  2. Guide the user to create a custom policy in the IAM console
  3. Pause execution and wait for user confirmation

---

Core Tools

All actions are dispatched through scripts/huawei-cloud.py using skill action=exec.

Primary Change Impact Analysis

| Action | Required Parameters | Description |
|--------|---------------------|-------------|
| huawei_change_impact_analyze | region, cluster_id | Primary comprehensive action: orchestrates audit log ingestion, K8s event correlation, AOM alarm correlation, resource snapshot collection, noise filtering, blast radius modeling, and risk scoring into a unified change impact report with Top N risk alerts |

Audit and Event Collection

| Action | Required Parameters | Description |
|--------|---------------------|-------------|
| huawei_query_cce_audit_logs | region, cluster_id | Query CCE Kubernetes audit logs for create/update/patch/delete operations with actor, verb, resource, namespace, name, requestURI, statusCode |
| huawei_query_k8s_events_from_lts | region, cluster_id | Query historical K8s Events from LTS (overcomes the K8s API short event window) |
| huawei_get_cce_events | region, cluster_id | List current Kubernetes Events when LTS is unavailable |

Alarm Correlation

| Action | Required Parameters | Description |
|--------|---------------------|-------------|
| huawei_analyze_aom_alarms | region, cluster_id | Analyze AOM active + history alarm patterns and correlation across resources |

Domain Drill-Down (Read-Only)

| Action | Required Parameters | Description |
|--------|---------------------|-------------|
| huawei_workload_rollout_diagnose | region, cluster_id, namespace, kind, name | Drill down when changes point to Deployment/StatefulSet/DaemonSet rollout failures (cross-skill: huawei-cloud-cce-workload-failure-diagnoser) |
| huawei_network_failure_diagnose | region, cluster_id | Drill down when changes point to Service/Ingress/NetworkPolicy/ELB connectivity failures (cross-skill: huawei-cloud-cce-network-failure-diagnoser) |
| huawei_node_failure_diagnose | region, cluster_id | Drill down when changes point to Node taint, NotReady, scheduling, or resource pressure (cross-skill: huawei-cloud-cce-node-failure-diagnoser) |

Current Topology Snapshots

| Action | Required Parameters | Description |
|--------|---------------------|-------------|
| huawei_get_cce_pods | region, cluster_id | List current Pod status for blast radius modeling |
| huawei_get_cce_deployments | region, cluster_id | List current Deployment status |
| huawei_get_cce_services | region, cluster_id | List current Service selector/ports for impact mapping |
| huawei_get_cce_ingresses | region, cluster_id | List current Ingress rules/backends for impact mapping |
| huawei_get_kubernetes_nodes | region, cluster_id | List current Node status for taint/impact mapping |
| huawei_list_cce_configmaps | region, cluster_id | List current ConfigMap objects (identify CoreDNS, kube-proxy, business configs) |
| huawei_list_cce_secrets | region, cluster_id | List current Secret objects |
| huawei_list_cce_nodepools | region, cluster_id | List current NodePool status for infrastructure change context |

Cloud Network Snapshots

| Action | Required Parameters | Description |
|--------|---------------------|-------------|
| huawei_list_security_groups | region | List current Security Group rules for cloud network context |
| huawei_list_vpc_acls | region | List current VPC ACL rules for cloud network context |

---

Parameter Reference

Common Parameters:

| Parameter | Required | Description |
|-----------|----------|-------------|
| region | Yes | Huawei Cloud region, e.g., cn-north-4 |
| cluster_id | Yes | CCE cluster ID |

Optional Parameters (passed via --params JSON):

| Parameter | Description |
|-----------|-------------|
| hours | Analysis window in hours (default 1) |
| start_time | Analysis window start (YYYY-MM-DD HH:MM:SS), alternative to hours |
| end_time | Analysis window end (YYYY-MM-DD HH:MM:SS), alternative to hours |
| namespace | Narrow scope to a namespace, but do not exclude kube-system/CoreDNS global changes |
| target_name | Target object name for scope narrowing |
| workload_name | Workload name for scope narrowing |
| app_name | Application name for scope narrowing |
| fault_time | Incident time point for temporal proximity scoring |
| incident_time | Alternative to fault_time |
| log_group_id | Audit log group ID (manual fallback when auto-discovery fails) |
| log_stream_id | Audit log stream ID (manual fallback when auto-discovery fails) |
| include_audit | Enable/disable audit log collection (default true) |
| include_k8s_events | Enable/disable K8s event collection (default true) |
| include_aom | Enable/disable AOM alarm collection (default true) |
| include_snapshots | Enable/disable resource snapshot collection (default true) |
| top_n | Number of top risk alerts in report (default 3) |
| output_file | Path to write the Markdown report file |
| ak | Override AK (uses HW_ACCESS_KEY by default) |
| sk | Override SK (uses HW_SECRET_KEY by default) |
| project_id | Override project ID (auto-obtained via IAM when not set) |

---
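
When the incident time is known, the time-window parameters from the Parameter Reference can be derived from it instead of relying on the default of one hour. The sketch below builds such a --params payload. The helper name `build_window_params`, the choice of two hours before and fifteen minutes after the fault, and the use of naive timestamps without a timezone are assumptions; only the parameter names and the YYYY-MM-DD HH:MM:SS format come from the tables.

```python
import json
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches YYYY-MM-DD HH:MM:SS


def build_window_params(region, cluster_id, fault_time,
                        before_hours=2, after_minutes=15):
    """Build a params dict whose window brackets the fault time (hypothetical helper)."""
    fault = datetime.strptime(fault_time, TIME_FORMAT)
    return {
        "region": region,
        "cluster_id": cluster_id,
        # start_time/end_time are used instead of hours, as the table allows
        "start_time": (fault - timedelta(hours=before_hours)).strftime(TIME_FORMAT),
        "end_time": (fault + timedelta(minutes=after_minutes)).strftime(TIME_FORMAT),
        "fault_time": fault_time,
        "top_n": 3,
    }


params = build_window_params("cn-north-4", "<cluster_id>", "2024-05-01 10:30:00")
print(json.dumps(params, indent=2))
```

Extending the window slightly past the fault time leaves room for alarms and events that fire after the change takes effect.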

Output Format

Primary: huawei_change_impact_analyze

Returns structured JSON with embedded report_markdown. See references/output-schema.md for the full schema.

```json
{
  "success": true,
  "analysis_trace_id": "CIA-yyyymmddHHMMSS-xxxxxxxx",
  "analysis_window": {
    "start_time": "YYYY-MM-DD HH:MM:SS",
    "end_time": "YYYY-MM-DD HH:MM:SS",
    "hours": 1
  },
  "scope": {
    "region": "cn-north-4",
    "cluster_id": "cluster-id",
    "namespace": "optional",
    "target_name": "optional"
  },
  "summary": {
    "core_change_count": 3,
    "top_risk_count": 3,
    "data_sources": {
      "CCE Audit Logs": "success",
      "K8s Historical Events": "success",
      "AOM Alarms": "success",
      "Current Resource Snapshots": "success"
    }
  },
  "top_changes": [
    {
      "time": "YYYY-MM-DD HH:MM:SS",
      "verb": "patch",
      "resource": "configmaps",
      "namespace": "kube-system",
      "name": "coredns",
      "object_key": "kube-system/coredns",
      "category": "global_config_change",
      "title": "Cluster core configuration change",
      "actor": "user or serviceAccount",
      "semantic_fields": ["data", "Corefile"],
      "blast_radius": "cluster-wide",
      "impacted_entities": {
        "pods": [],
        "services": ["kube-system/kube-dns"],
        "ingresses": [],
        "nodes": ["node-a"]
      },
      "risk_score": 96,
      "risk_level": "Critical",
      "confidence": "high",
      "risk_reasons": [],
      "evidence": []
    }
  ],
  "changes": [],
  "report_markdown": "# CCE Change Impact Analysis Report\n...",
  "report_file": "/optional/path/report.md",
  "capture_metadata": {}
}
```
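
A consumer of this JSON might first check which data sources failed to load and then rank the reported changes by score. The sketch below uses only field names shown in the sample above. The helper name `summarize_result`, the inline sample payload, the `"failed"` status value, and the `"Medium"` level are illustrative assumptions; the sample above only shows `"success"` and `"Critical"`.

```python
import json


def summarize_result(raw):
    """Return (data gaps, ranked change lines) from an analyzer result string."""
    result = json.loads(raw)
    if not result.get("success"):
        raise ValueError("analysis did not report success")
    sources = result.get("summary", {}).get("data_sources", {})
    # Any source whose status is not "success" is treated as a data gap.
    gaps = [name for name, status in sources.items() if status != "success"]
    ranked = sorted(result.get("top_changes", []),
                    key=lambda change: change.get("risk_score", 0),
                    reverse=True)
    lines = ["{:>3} {:<8} {} {}".format(c["risk_score"], c["risk_level"],
                                        c["verb"], c["object_key"])
             for c in ranked]
    return gaps, lines


# Illustrative payload, trimmed to the fields this helper reads.
sample = json.dumps({
    "success": True,
    "summary": {"data_sources": {"CCE Audit Logs": "success",
                                 "AOM Alarms": "failed"}},
    "top_changes": [
        {"risk_score": 41, "risk_level": "Medium", "verb": "update",
         "object_key": "default/web"},
        {"risk_score": 96, "risk_level": "Critical", "verb": "patch",
         "object_key": "kube-system/coredns"},
    ],
})
gaps, lines = summarize_result(sample)
print("data gaps:", gaps)
print("\n".join(lines))
```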

---

## Verification

1. Run the environment check script to confirm dependencies and credentials are available
2. Use `huawei_change_impact_analyze` on a known stable cluster to verify it returns `success: true` with zero or low-confidence core changes
3. Use `huawei_change_impact_analyze` on a cluster with known recent changes to verify Top N risk alerts are accurately identified
4. Verify that noise filtering correctly excludes HPA replica-only updates, controller status writes, and Lease/Token/status subresource writes
5. Verify that CoreDNS/kube-proxy/kube-system changes are always included regardless of namespace scope
6. Verify that blast radius mapping correctly traces Service selector → Pod → Ingress → Node propagation
7. Confirm that low-confidence conclusions are clearly labeled with data gaps
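
To make the propagation in step 6 concrete, the sketch below traces a changed Service to the Pods its selector matches, the Ingresses that route to it, and the Nodes hosting those Pods. The flattened dict shapes (`selector`, `labels`, `node`, `backends`) and the function names are simplified assumptions for illustration, not the skill's actual snapshot format. It can serve as a hand-checkable reference when verifying a report's `impacted_entities`.

```python
def selector_matches(selector, labels):
    """A Service selects a Pod when every selector key/value is in the Pod labels."""
    # A Service with an empty selector selects no Pods.
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def blast_radius(service, pods, ingresses):
    """Map one Service to the Pods, Ingresses, and Nodes it touches."""
    ns = service["namespace"]
    hit_pods = [p for p in pods
                if p["namespace"] == ns
                and selector_matches(service["selector"], p["labels"])]
    hit_ingresses = [i for i in ingresses
                     if i["namespace"] == ns and service["name"] in i["backends"]]
    return {
        "pods": sorted("{}/{}".format(p["namespace"], p["name"]) for p in hit_pods),
        "ingresses": sorted("{}/{}".format(i["namespace"], i["name"])
                            for i in hit_ingresses),
        "nodes": sorted({p["node"] for p in hit_pods}),
    }


service = {"namespace": "shop", "name": "cart", "selector": {"app": "cart"}}
pods = [
    {"namespace": "shop", "name": "cart-1", "labels": {"app": "cart", "tier": "web"}, "node": "node-a"},
    {"namespace": "shop", "name": "cart-2", "labels": {"app": "cart"}, "node": "node-b"},
    {"namespace": "shop", "name": "pay-1", "labels": {"app": "pay"}, "node": "node-a"},
    {"namespace": "other", "name": "cart-x", "labels": {"app": "cart"}, "node": "node-c"},
]
ingresses = [
    {"namespace": "shop", "name": "shop-ing", "backends": ["cart", "pay"]},
    {"namespace": "shop", "name": "admin-ing", "backends": ["admin"]},
]
radius = blast_radius(service, pods, ingresses)
print(radius)
```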

---

## Best Practices

1. Always start with `huawei_change_impact_analyze` for comprehensive change correlation; drill down into domain diagnoser actions only when specific evidence requires deeper analysis
2. Find changes first, then map impact, then align with alarms/events/fault time; do not conclude root cause from object updates alone
3. CoreDNS, kube-proxy, network plugin, and Ingress controller config changes in `kube-system` must always be included in business fault analysis regardless of target namespace scope
4. Deployment HPA-only `replicas` adjustments are noise; image, startup args, probe, resource spec, environment variable, and ConfigMap/Secret reference changes are core changes
5. NetworkPolicy/RBAC changes must be correlated with connection timeouts, 403, DNS anomalies, and cross-namespace access failures
6. Node taint, cordon/drain, node pool resize, and cluster upgrade changes must be correlated with Pod Pending, Evicted, NotReady, and node events
7. All remediation actions must be output as recommendations only and handed off to `huawei-cloud-cce-auto-remediation-runner`
8. Clearly label low-confidence conclusions with required supplementary data; never present speculation as fact
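
Practices 3 and 4 amount to two filters applied to the audit records: drop noise, then narrow by namespace while always keeping `kube-system`. The sketch below shows one way to express them. The record shape (in particular the `subresource` and `changed_fields` keys), the resource names in the noise sets, and the function names are assumptions for illustration. Treating every replicas-only Deployment update as HPA noise is a simplification, since it does not check the actor.

```python
NOISE_RESOURCES = {"leases", "tokenreviews"}   # assumed resource names
NOISE_SUBRESOURCES = {"status", "token"}       # assumed subresource names
GLOBAL_NAMESPACE = "kube-system"


def is_noise(change):
    """Return True for control-plane or autoscaler churn."""
    if change["resource"] in NOISE_RESOURCES:
        return True
    if change.get("subresource") in NOISE_SUBRESOURCES:
        return True
    # Simplification: a Deployment update touching only spec.replicas is noise.
    fields = set(change.get("changed_fields", []))
    return change["resource"] == "deployments" and fields == {"spec.replicas"}


def in_scope(change, target_namespace):
    """kube-system changes stay in scope no matter which namespace is targeted."""
    if change["namespace"] == GLOBAL_NAMESPACE:
        return True
    return target_namespace is None or change["namespace"] == target_namespace


def core_changes(changes, target_namespace=None):
    return [c for c in changes if not is_noise(c) and in_scope(c, target_namespace)]


changes = [
    {"verb": "update", "resource": "leases", "namespace": "kube-node-lease", "name": "node-a"},
    {"verb": "patch", "resource": "deployments", "namespace": "shop", "name": "cart",
     "changed_fields": ["spec.replicas"]},
    {"verb": "patch", "resource": "deployments", "namespace": "shop", "name": "cart",
     "changed_fields": ["spec.template.spec.containers[0].image"]},
    {"verb": "patch", "resource": "configmaps", "namespace": "kube-system", "name": "coredns",
     "changed_fields": ["data.Corefile"]},
    {"verb": "update", "resource": "deployments", "namespace": "billing", "name": "api",
     "changed_fields": ["spec.template.spec.containers[0].env"]},
]
kept = core_changes(changes, target_namespace="shop")
print([(c["namespace"], c["name"]) for c in kept])
```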

---

## Reference Documents

- Four-stage pipeline and risk scoring rules: `references/workflow.md`
- Reusable capabilities, gaps, and suggested atomic actions: `references/capability-map.md`
- Output field specification and Markdown template: `references/output-schema.md`
- Read-only boundaries and remediation handoff rules: `references/risk-rules.md`
- [Huawei Cloud CCE Documentation](https://support.huaweicloud.com/cce/index.html)
- [Huawei Cloud Python SDK Documentation](https://support.huaweicloud.com/api-cce/cce_02_0113.html)

---

## Notes

1. This skill performs read-only analysis and report generation only; it does not modify workloads, roll back releases, change ConfigMaps/Secrets, adjust Security Group/ACL/NetworkPolicy/RBAC settings, or cordon/drain/reboot nodes
2. Do not output the values of HW_ACCESS_KEY, HW_SECRET_KEY, HW_SECURITY_TOKEN, or other environment variables
3. All scripts must be executed via `skill action=exec`; do not run them directly in a shell
4. Any remediation action must be handed off to `huawei-cloud-cce-auto-remediation-runner`; this skill never executes remediation
5. The environment check script must be run before any analysis action
6. When using temporary AK/SK, HW_SECURITY_TOKEN must be set
7. Cross-skill references: remediation → `huawei-cloud-cce-auto-remediation-runner`; comprehensive root cause → `huawei-cloud-cce-root-cause-analyzer`; workload diagnosis → `huawei-cloud-cce-workload-failure-diagnoser`; network diagnosis → `huawei-cloud-cce-network-failure-diagnoser`; node diagnosis → `huawei-cloud-cce-node-failure-diagnoser`

---

## Common Pitfalls

1. **Concluding root cause from object updates alone**: Always require temporal proximity, event/alarm response, and topology impact evidence; an object update without correlation is insufficient evidence
2. **Excluding kube-system changes when scoped to a namespace**: CoreDNS, kube-proxy, and cluster plugin changes are global even when the target namespace is different; always include them
3. **Treating HPA replica updates as core changes**: HPA-only `replicas` modifications are noise; only image, probe, resource, env, and config reference changes are core
4. **Not correlating NetworkPolicy/RBAC with connectivity symptoms**: NetworkPolicy/RBAC changes must be cross-referenced with connection timeout, 403, DNS anomaly, and cross-namespace access failure events
5. **Attempting remediation actions from this skill**: All changes must be handed off to `huawei-cloud-cce-auto-remediation-runner`; this skill only outputs recommendations
6. **Failing to label low-confidence conclusions**: When evidence is insufficient, write "insufficient evidence" explicitly with data gaps; never present guesses as conclusions
7. **Ignoring controller and platform noise**: Lease, Token, status subresource, Node status patch, scheduler binding, and CCE platform-managed RBAC updates must all be filtered out; they are control-plane closed-loop operations, not user changes
8. **Not building a fault timeline**: Establish user-perceived fault time, alarm trigger time, Kubernetes event time, and change time before scoring risk
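
The timeline in pitfall 8 can be built before any scoring by sorting the four kinds of timestamps and then measuring how close each change is to the fault. The sketch below is a minimal illustration. The linear decay, the 60-minute window, the rule that changes after the fault score zero, and the function names are assumptions, not the skill's actual temporal proximity rule.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_timeline(points):
    """Sort (label, timestamp string) pairs chronologically."""
    parsed = [(label, datetime.strptime(ts, TIME_FORMAT)) for label, ts in points]
    return sorted(parsed, key=lambda item: item[1])


def temporal_proximity(change_time, fault_time, window_minutes=60):
    """1.0 for a change at the fault time, decaying linearly to 0.0 at the window edge."""
    minutes_before = (fault_time - change_time).total_seconds() / 60.0
    # A change after the fault, or older than the window, gets no proximity credit.
    if minutes_before < 0 or minutes_before > window_minutes:
        return 0.0
    return 1.0 - minutes_before / window_minutes


timeline = build_timeline([
    ("alarm", "2024-05-01 10:17:00"),
    ("user_fault", "2024-05-01 10:15:00"),
    ("change", "2024-05-01 10:00:00"),
    ("k8s_event", "2024-05-01 10:02:00"),
])
times = dict(timeline)
proximity = temporal_proximity(times["change"], times["user_fault"])
print([label for label, _ in timeline], proximity)
```

Ordering the timeline first makes it obvious when a suspected change actually happened after the user-perceived fault, in which case it should not be scored as a cause.
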
Safe Usage Recommendations
Install only after reviewing the full action surface and use tightly scoped read-only Huawei Cloud credentials if possible. Do not grant production admin credentials unless you intentionally want the bundled administrative and remediation actions available, and avoid invoking unlisted actions through the dispatcher.
Capability Tags
requires-wallet · requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The stated purpose and references repeatedly define a read-only analyzer, yet the included dispatcher registers cluster/node/workload deletion, scaling, addon install/uninstall/update, CCE/CCI provisioning, node cordon/drain, ECS operations, HSS status changes, AOM rule mutation, and workload rollback actions.
Instruction Scope
SKILL.md and risk-rules.md prohibit remediation and resource modification, while scripts/huawei_cloud/dispatcher.py makes those actions callable through the same scripts/huawei-cloud.py entry point that the skill instructs agents to use.
Install Mechanism
No hidden installer or malware-like install behavior was found, and VirusTotal telemetry is clean; however, the documented check_env scripts referenced for dependency installation are not present in the artifact file list.
Credentials
Huawei Cloud AK/SK credentials and CCE access are coherent for diagnostics, but the packaged action surface goes beyond read-only analysis and includes high-impact infrastructure administration and access to pod logs, Secret metadata, and optionally Secret data.
Persistence & Privilege
There is no evidence of background persistence, but the code can generate and return kubeconfig material, create temporary client certificate/key files, and write reports to caller-supplied paths; these privileges are high-impact for an analysis-only skill.
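How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install huawei-cloud-cce-change-impact-analyzer
  3. After installation, invoke the Skill by name or trigger it with /huawei-cloud-cce-change-impact-analyzer
  4. Provide the required inputs according to the Skill's parameter reference to get structured output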
Version History
v0.1.0
Initial release
Metadata
Slug huawei-cloud-cce-change-impact-analyzer
Version 0.1.0
License MIT-0
Total installs 0
Current installs 0
Version count 1
FAQ

What is Huawei Cloud Cce Change Impact Analyzer?

Huawei Cloud CCE change impact analysis skill that converts "what changed before the incident" into provable causal attribution. Use this skill when a CCE in... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 24 total downloads so far.

How do I install Huawei Cloud Cce Change Impact Analyzer?

Run the command "/install huawei-cloud-cce-change-impact-analyzer" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is Huawei Cloud Cce Change Impact Analyzer free?

Yes. Huawei Cloud Cce Change Impact Analyzer is completely free; it is released under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does Huawei Cloud Cce Change Impact Analyzer support?

Huawei Cloud Cce Change Impact Analyzer is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Huawei Cloud Cce Change Impact Analyzer?

It is developed and maintained by shijingcheng (@pintudeyudi); the current version is v0.1.0.
