← Back to Skills Marketplace
sdk-team

Alibabacloud Dataworks Data Governance

by alibabacloud-skills-team · GitHub ↗ · v0.0.1-beta.1 · MIT-0
cross-platform ⚠ suspicious
100
Downloads
0
Stars
0
Active Installs
1
Versions
Install in OpenClaw
/install alibabacloud-dataworks-data-governance
Description
Alibaba Cloud DataWorks Data Governance Tag Management Skill. Use for managing data asset tags: creating, updating, querying tag keys/values, binding/unbindi...
README (SKILL.md)

DataWorks Data Governance Tag Management

Manage data asset tags via the DataWorks Data Governance API, including creating, updating, and querying tag keys/values, as well as batch binding and unbinding tags on data assets.

FORBIDDEN API — DeleteDataAssetTag must NOT be used in this skill. Calling DeleteDataAssetTag is strictly prohibited. Do not implement, suggest, or invoke this API under any circumstances. If a user requests deletion of a tag key or tag value, inform them that this operation is outside the scope of this skill and must be handled through other authorized channels.

Architecture: DataWorks Data Governance Tag API (ROA style) + Alibaba Cloud Python Common SDK


Pre-check: Aliyun CLI >= 3.3.1 required Run aliyun version to verify >= 3.3.1. If not installed or version too low, see references/cli-installation-guide.md for installation instructions. Then [MUST] run aliyun configure set --auto-plugin-install true to enable automatic plugin installation.


Pre-check: Alibaba Cloud Credentials Required

Security Rules:

  • NEVER read, echo, or print AK/SK values (e.g., echo $ALIBABA_CLOUD_ACCESS_KEY_ID is FORBIDDEN)
  • NEVER ask the user to input AK/SK directly in the conversation or command line
  • NEVER use aliyun configure set with literal credential values
  • ONLY use aliyun configure list to check credential status
aliyun configure list

Check the output for a valid profile (AK, STS, or OAuth identity).

If no valid profile exists, STOP here.

  1. Obtain credentials from Alibaba Cloud Console
  2. Configure credentials outside of this session (via aliyun configure in terminal or environment variables in shell profile)
  3. Return and re-run after aliyun configure list shows a valid profile

Installation

pip3 install alibabacloud_tea_openapi==0.4.4 alibabacloud_credentials==1.0.8 alibabacloud_tea_util==0.3.14 alibabacloud_openapi_util==0.2.4

RAM Permissions

Product RAM Action Resource Scope Description
DataWorks dataworks:CreateDataAssetTag * Create a data asset tag key
DataWorks dataworks:UpdateDataAssetTag * Update a data asset tag key
DataWorks dataworks:ListDataAssetTags * Query the data asset tag list
DataWorks dataworks:TagDataAssets * Bind tags to data assets
DataWorks dataworks:UnTagDataAssets * Unbind tags from data assets
DataWorks dataworks:ListDataAssets * Query the data asset list

[MUST] Permission Failure Handling: When any command or API call fails due to permission errors at any point during execution, follow this process:

  1. Read references/ram-policies.md to get the full list of permissions required by this SKILL
  2. Use ram-permission-diagnose skill to guide the user through requesting the necessary permissions
  3. Pause and wait until the user confirms that the required permissions have been granted

Parameter Confirmation

IMPORTANT: Parameter Confirmation — Before executing any command or API call, ALL user-customizable parameters (e.g., RegionId, instance names, CIDR blocks, passwords, domain names, resource specifications, etc.) MUST be confirmed with the user. Do NOT assume or use default values without explicit user approval.

Parameter Required/Optional Description Default
region_id Required Alibaba Cloud Region ID, e.g. cn-hangzhou None — confirm with user
tag_key Required (create/update) Tag key name None — confirm with user
tag_values Optional List of tag values []
description Optional Description of the tag key None
category Optional Tag category: Normal or CUSTOM Normal
managers Optional List of tag manager user IDs []
data_asset_ids Required (bind/unbind) List of data asset identifiers, max 100 None — confirm with user
data_asset_type Required (bind/unbind) Asset type, e.g. ACS::DataWorks::Table None — confirm with user
project_id Conditionally required Project space ID (required when DataAssetType is ACS::DataWorks::Task) None
env_type Conditionally required Project environment — required when project_id is set: Prod / Dev None
asset_type Required (ListDataAssets) Asset object type: Table, Task, Node, WorkFlow, DataServiceApi, DataQualityRule None — confirm with user
asset_tags Optional (ListDataAssets) Filter by tag key-value list, max 20, e.g. [{"Key":"k1","Value":"v1"}] None
asset_name Optional (ListDataAssets) Asset name keyword for fuzzy search None
asset_ids Optional (ListDataAssets) List of asset module IDs None
asset_owner Optional (ListDataAssets) Asset owner user ID None
sort_by Optional (ListDataAssets) Sort field and direction, e.g. CreateTime Desc CreateTime Desc

Core Workflow

Initialize SDK Client

import re
from alibabacloud_tea_openapi.client import Client as OpenApiClient
from alibabacloud_credentials.client import Client as CredentialClient
from alibabacloud_tea_openapi import models as open_api_models
from alibabacloud_tea_util import models as util_models
from alibabacloud_openapi_util.client import Client as OpenApiUtilClient
import json

_VALID_DATA_ASSET_TYPES = {
    'ACS::DataWorks::Table', 'ACS::DataWorks::Task',
    'ACS::DataWorks::Node', 'ACS::DataWorks::WorkFlow',
    'ACS::DataWorks::DataServiceApi', 'ACS::DataWorks::DataQualityRule'
}
_VALID_ASSET_TYPES = {'Table', 'Task', 'Node', 'WorkFlow', 'DataServiceApi', 'DataQualityRule'}
_VALID_CATEGORIES  = {'Normal', 'CUSTOM'}
_VALID_ENV_TYPES   = {'Prod', 'Dev'}

def _validate_region_id(region_id):
    if not isinstance(region_id, str) or not region_id:
        raise ValueError('region_id must be a non-empty string')
    if not re.match(r'^[a-z]{2}-[a-z0-9-]+$', region_id):
        raise ValueError(f'Invalid region_id format: {region_id!r}')

def _validate_tags(tags, max_len=20):
    if not isinstance(tags, list) or len(tags) == 0:
        raise ValueError('tags must be a non-empty list')
    if len(tags) > max_len:
        raise ValueError(f'tags exceeds max length {max_len}, got {len(tags)}')
    for t in tags:
        if not isinstance(t, dict) or 'Key' not in t or 'Value' not in t:
            raise ValueError('Each tag must be a dict with "Key" and "Value"')

def _validate_asset_ids(ids, max_len=100):
    if not isinstance(ids, list) or len(ids) == 0:
        raise ValueError('data_asset_ids must be a non-empty list')
    if len(ids) > max_len:
        raise ValueError(f'data_asset_ids exceeds max length {max_len}, got {len(ids)}')

def _validate_page(page_number, page_size):
    if not isinstance(page_number, int) or page_number \x3C 1:
        raise ValueError(f'page_number must be an integer >= 1, got {page_number!r}')
    if not isinstance(page_size, int) or not (1 \x3C= page_size \x3C= 100):
        raise ValueError(f'page_size must be an integer between 1 and 100, got {page_size!r}')

def create_client(region_id: str) -> OpenApiClient:
    _validate_region_id(region_id)
    credential = CredentialClient()
    config = open_api_models.Config(credential=credential)
    config.endpoint = f'dataworks.{region_id}.aliyuncs.com'
    config.user_agent = 'AlibabaCloud-Agent-Skills'
    return OpenApiClient(config)

1. Create Tag Key (CreateDataAssetTag)

def create_data_asset_tag(client, key: str, values=None, description=None,
                           value_policy=None, category='Normal', managers=None):
    if not isinstance(key, str) or not key.strip():
        raise ValueError('key must be a non-empty string')
    if category not in _VALID_CATEGORIES:
        raise ValueError(f'category must be one of {_VALID_CATEGORIES}, got {category!r}')
    if values is not None:
        if not isinstance(values, list):
            raise ValueError('values must be a list')
    if managers is not None:
        if not isinstance(managers, list):
            raise ValueError('managers must be a list')

    params = open_api_models.Params(
        action='CreateDataAssetTag',
        version='2024-05-18',
        protocol='HTTPS',
        method='POST',
        auth_type='AK',
        style='ROA',
        pathname='/api/v1/data-governance/tags',
        req_body_type='json',
        body_type='json'
    )
    body = {'Key': key}
    if values is not None:
        body['Values'] = values
    if description is not None:
        body['Description'] = description
    if value_policy is not None:
        body['ValuePolicy'] = value_policy
    if category:
        body['Category'] = category
    if managers is not None:
        body['Managers'] = managers

    request = open_api_models.OpenApiRequest(body=body)
    runtime = util_models.RuntimeOptions()
    runtime.connect_timeout = 5000   # ms
    runtime.read_timeout = 30000     # ms
    return client.call_api(params, request, runtime)

2. Update Tag Key (UpdateDataAssetTag)

def update_data_asset_tag(client, key: str, values=None, description=None, managers=None):
    if not isinstance(key, str) or not key.strip():
        raise ValueError('key must be a non-empty string')
    if values is not None:
        if not isinstance(values, list):
            raise ValueError('values must be a list')
    if managers is not None:
        if not isinstance(managers, list):
            raise ValueError('managers must be a list')

    params = open_api_models.Params(
        action='UpdateDataAssetTag',
        version='2024-05-18',
        protocol='HTTPS',
        method='PUT',
        auth_type='AK',
        style='ROA',
        pathname='/api/v1/data-governance/tags',
        req_body_type='json',
        body_type='json'
    )
    body = {'Key': key}
    if values is not None:
        body['Values'] = values
    if description is not None:
        body['Description'] = description
    if managers is not None:
        body['Managers'] = managers

    request = open_api_models.OpenApiRequest(body=body)
    runtime = util_models.RuntimeOptions()
    runtime.connect_timeout = 5000   # ms
    runtime.read_timeout = 30000     # ms
    return client.call_api(params, request, runtime)

3. List Tag Keys (ListDataAssetTags)

def list_data_asset_tags(client, key=None, category=None, page_number=1, page_size=10):
    _validate_page(page_number, page_size)
    if category is not None and category not in _VALID_CATEGORIES:
        raise ValueError(f'category must be one of {_VALID_CATEGORIES}, got {category!r}')

    params = open_api_models.Params(
        action='ListDataAssetTags',
        version='2024-05-18',
        protocol='HTTPS',
        method='GET',
        auth_type='AK',
        style='ROA',
        pathname='/api/v1/data-governance/tags',
        req_body_type='json',
        body_type='json'
    )
    queries = {
        'PageNumber': page_number,
        'PageSize': page_size
    }
    if key:
        queries['Key'] = key
    if category:
        queries['Category'] = category

    request = open_api_models.OpenApiRequest(
        query=OpenApiUtilClient.query(queries)
    )
    runtime = util_models.RuntimeOptions()
    runtime.connect_timeout = 5000   # ms
    runtime.read_timeout = 30000     # ms
    return client.call_api(params, request, runtime)

4. Bind Tags to Data Assets (TagDataAssets)

def tag_data_assets(client, tags: list, data_asset_ids: list, data_asset_type: str,
                    project_id=None, env_type=None, auto_trace_enable=False):
    _validate_tags(tags, max_len=20)
    _validate_asset_ids(data_asset_ids, max_len=100)
    if data_asset_type not in _VALID_DATA_ASSET_TYPES:
        raise ValueError(f'data_asset_type must be one of {_VALID_DATA_ASSET_TYPES}, got {data_asset_type!r}')
    if data_asset_type == 'ACS::DataWorks::Task':
        if project_id is None or env_type is None:
            raise ValueError('project_id and env_type are required when data_asset_type is ACS::DataWorks::Task')
    if env_type is not None and env_type not in _VALID_ENV_TYPES:
        raise ValueError(f'env_type must be one of {_VALID_ENV_TYPES}, got {env_type!r}')

    params = open_api_models.Params(
        action='TagDataAssets',
        version='2024-05-18',
        protocol='HTTPS',
        method='POST',
        auth_type='AK',
        style='ROA',
        pathname='/api/v1/data-governance/tags/bind',
        req_body_type='json',
        body_type='json'
    )
    body = {
        'Tags': tags,
        'DataAssetIds': data_asset_ids,
        'DataAssetType': data_asset_type,
        'AutoTraceEnable': auto_trace_enable
    }
    if project_id is not None:
        body['ProjectId'] = project_id
    if env_type is not None:
        body['EnvType'] = env_type

    request = open_api_models.OpenApiRequest(body=body)
    runtime = util_models.RuntimeOptions()
    runtime.connect_timeout = 5000   # ms
    runtime.read_timeout = 30000     # ms
    return client.call_api(params, request, runtime)

5. List Data Assets (ListDataAssets)

def list_data_assets(client, asset_type: str, tags=None, name=None, ids=None,
                     owner=None, project_id=None, sort_by=None,
                     page_number=1, page_size=10):
    if asset_type not in _VALID_ASSET_TYPES:
        raise ValueError(f'asset_type must be one of {_VALID_ASSET_TYPES}, got {asset_type!r}')
    _validate_page(page_number, page_size)
    if tags is not None:
        _validate_tags(tags, max_len=20)

    params = open_api_models.Params(
        action='ListDataAssets',
        version='2024-05-18',
        protocol='HTTPS',
        method='GET',
        auth_type='AK',
        style='ROA',
        pathname='/api/v1/data-governance/assets',
        req_body_type='json',
        body_type='json'
    )
    queries = {
        'Type': asset_type,
        'PageNumber': page_number,
        'PageSize': page_size
    }
    if tags is not None:
        queries['Tags'] = json.dumps(tags)
    if name is not None:
        queries['Name'] = name
    if ids is not None:
        queries['Ids'] = json.dumps(ids)
    if owner is not None:
        queries['Owner'] = owner
    if project_id is not None:
        queries['ProjectId'] = project_id
    if sort_by is not None:
        queries['SortBy'] = sort_by

    request = open_api_models.OpenApiRequest(
        query=OpenApiUtilClient.query(queries)
    )
    runtime = util_models.RuntimeOptions()
    runtime.connect_timeout = 5000   # ms
    runtime.read_timeout = 30000     # ms
    return client.call_api(params, request, runtime)

6. Unbind Tags from Data Assets (UnTagDataAssets)

def untag_data_assets(client, tags: list, data_asset_ids: list, data_asset_type: str,
                      project_id=None, env_type=None):
    _validate_tags(tags, max_len=20)
    _validate_asset_ids(data_asset_ids, max_len=100)
    if data_asset_type not in _VALID_DATA_ASSET_TYPES:
        raise ValueError(f'data_asset_type must be one of {_VALID_DATA_ASSET_TYPES}, got {data_asset_type!r}')
    if data_asset_type == 'ACS::DataWorks::Task':
        if project_id is None or env_type is None:
            raise ValueError('project_id and env_type are required when data_asset_type is ACS::DataWorks::Task')
    if env_type is not None and env_type not in _VALID_ENV_TYPES:
        raise ValueError(f'env_type must be one of {_VALID_ENV_TYPES}, got {env_type!r}')

    params = open_api_models.Params(
        action='UnTagDataAssets',
        version='2024-05-18',
        protocol='HTTPS',
        method='POST',
        auth_type='AK',
        style='ROA',
        pathname='/api/v1/data-governance/tags/unbind',
        req_body_type='json',
        body_type='json'
    )
    body = {
        'Tags': tags,
        'DataAssetIds': data_asset_ids,
        'DataAssetType': data_asset_type
    }
    if project_id is not None:
        body['ProjectId'] = project_id
    if env_type is not None:
        body['EnvType'] = env_type

    request = open_api_models.OpenApiRequest(body=body)
    runtime = util_models.RuntimeOptions()
    runtime.connect_timeout = 5000   # ms
    runtime.read_timeout = 30000     # ms
    return client.call_api(params, request, runtime)

Response Handling

def handle_response(response):
    status_code = response.get('statusCode')
    body = response.get('body', {})
    if status_code == 200 and body.get('Data'):
        print(f"[SUCCESS] RequestId: {body.get('RequestId')}")
        return body.get('Data')
    else:
        code = body.get('Code')
        message = body.get('Message')
        raise Exception(f"[FAIL] Code={code}, Message={message}, RequestId={body.get('RequestId')}")

Success Verification

Operation success criteria:

  • HTTP status code 200
  • Response body Data field is true (create/update/bind/unbind operations)
  • Response body Data is an object (ListDataAssetTags, ListDataAssets)

For detailed step-by-step verification commands, see references/verification-method.md.


Cleanup

client = create_client('cn-hangzhou')  # Replace with actual Region

# Unbind tags from data assets
untag_data_assets(client,
    tags=[{"Key": "k1", "Value": "v1"}],
    data_asset_ids=["maxcompute-table.project.tableName"],
    data_asset_type="ACS::DataWorks::Table"
)

Note: Deletion of tag keys or tag values via DeleteDataAssetTag is not supported in this skill. Do not attempt to call this API.


Best Practices

  1. Batch limits: Tags max 20, DataAssetIds max 100 per request — split into batches if needed
  2. Task-type assets: When DataAssetType is ACS::DataWorks::Task, both ProjectId and EnvType are required
  3. Value policy: Use ValuePolicy (e.g. regex ^L[1-7]$) when creating a tag key to enforce valid value formats
  4. Least privilege: Create a dedicated RAM role for tag management with only the required permissions
  5. Throttling retry: Use exponential backoff when encountering Throttling errors
  6. Pagination: ListDataAssetTags and ListDataAssets default to 10 per page, max 100 — use TotalCount to calculate total pages
  7. ListDataAssets serialization: Tags and Ids query parameters must be JSON-serialized strings (use json.dumps())
  8. ListDataAssets sort: Default sort is CreateTime Desc; also supports ModifyTime and HealthScore

Reference Documents

Document Description
cli-installation-guide.md Aliyun CLI installation and configuration guide
ram-policies.md RAM permission policies required by this Skill
related-commands.md API command reference
verification-method.md Step-by-step success verification
acceptance-criteria.md Acceptance criteria with correct/incorrect examples
Usage Guidance
What to consider before installing: - The skill requires active Alibaba Cloud credentials (AK/SK, STS, or instance role) but the registry metadata does not declare this; expect to provide credentials via your normal aliCloud config or environment variables outside the chat session. Do NOT paste keys into the conversation. - Review and confirm the pip install command and package versions; run installs in a controlled environment (virtualenv/container) if possible. - The SKILL.md recommends enabling 'aliyun configure set --auto-plugin-install true' — this modifies your Aliyun CLI behavior (it will auto-download plugins). Only enable if you trust the CLI source and understand the implications. - Follow least-privilege guidance in references/ram-policies.md: create a RAM user or role scoped to only the permissions listed; avoid using root credentials. - Because the skill source is unknown, test in a non-production account or with a minimally privileged RAM user first, and verify the behavior manually (or inspect any code produced by the agent) before granting broader permissions. - If the registry metadata is updated to explicitly declare required env vars/primary credential or an authoritative source/homepage is provided, confidence in this assessment would increase.
Capability Tags
requires-walletrequires-oauth-tokenrequires-sensitive-credentials
Capability Assessment
Purpose & Capability
The SKILL.md and reference files consistently describe DataWorks Data Governance tag operations and show use of the Alibaba Cloud Python SDK and CLI — these are appropriate for the stated purpose. However the published metadata does not declare that cloud credentials are required (the skill explicitly requires Alibaba Cloud credentials in its runtime instructions).
Instruction Scope
Instructions are narrowly focused on DataWorks tag management (create/update/list/tag/untag). They require confirming all user parameters before acting and forbid calling the DeleteDataAssetTag API. They also instruct running 'aliyun version' and 'aliyun configure list' and performing pip installs. No instructions request arbitrary file reads or secret exfiltration, but they do ask to change CLI settings (see persistence) and assume presence of credentials/config on the host.
Install Mechanism
There is no formal install spec in registry metadata (instruction-only). SKILL.md asks the user to run pip3 install for several Alibaba SDK packages and includes official aliyuncli download URLs. Using pip and official aliyuncli downloads is expected for this functionality but because the package is from an unknown skill source, users should verify package names/versions before running installs in production.
Credentials
The skill clearly requires Alibaba Cloud credentials and suggests environment/config-based authentication, yet the registry metadata lists no required env vars or primary credential. That mismatch reduces transparency: a user installing this skill may not realize credentials are necessary. The skill's credential needs themselves are proportional to the task (DataWorks APIs require them), but the metadata omission is a meaningful coherence issue.
Persistence & Privilege
The skill is instruction-only and does not request persistent presence or elevated platform privileges. It does recommend running 'aliyun configure set --auto-plugin-install true' which changes the CLI configuration and will allow the CLI to auto-install plugins; that is a benign convenience for using the CLI but is a configuration change the user should understand before applying. The skill is not marked always:true and does not modify other skills.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install alibabacloud-dataworks-data-governance
  3. After installation, invoke the skill by name or use /alibabacloud-dataworks-data-governance
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.0.1-beta.1
alibabacloud-dataworks-data-governance 0.0.1-beta.1 - Initial beta release for Alibaba Cloud DataWorks Data Governance Tag Management. - Supports creating, updating, listing, and querying tag keys/values. - Supports binding/unbinding tags to/from data assets, and querying data assets with flexible filters. - Enforces strict confirmation for all user-customizable parameters and credential checks. - Includes clear pre-checks for CLI version, credentials, and RAM permissions. - Explicitly prohibits usage of any tag deletion API (`DeleteDataAssetTag`).
Metadata
Slug alibabacloud-dataworks-data-governance
Version 0.0.1-beta.1
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Alibabacloud Dataworks Data Governance?

Alibaba Cloud DataWorks Data Governance Tag Management Skill. Use for managing data asset tags: creating, updating, querying tag keys/values, binding/unbindi... It is an AI Agent Skill for Claude Code / OpenClaw, with 100 downloads so far.

How do I install Alibabacloud Dataworks Data Governance?

Run "/install alibabacloud-dataworks-data-governance" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Alibabacloud Dataworks Data Governance free?

Yes, Alibabacloud Dataworks Data Governance is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Alibabacloud Dataworks Data Governance support?

Alibabacloud Dataworks Data Governance is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Alibabacloud Dataworks Data Governance?

It is built and maintained by alibabacloud-skills-team (@sdk-team); the current version is v0.0.1-beta.1.

💬 Comments