Spark Engineer

Name: Spark Engineer
Author: veeramanikandanr48

Description

Use when building Apache Spark applications or distributed data processing pipelines, or when optimizing big data workloads. Invoke for the DataFrame API, Spark SQL, RDD operations, performance tuning, and streaming analytics.

Usage Guidance

This skill is an offline reference and looks internally consistent with its Spark-focused purpose. Before running any provided code in your environment: 1) review and supply only the credentials your cluster/storage requires (the skill does not request any itself), 2) avoid running example collect() or large broadcasts on production data without safeguards, and 3) inspect any mapPartitions/foreachPartition code that opens external DB/HTTP connections to ensure it uses approved endpoints and secure credentials. If you plan to let an agent execute code from this skill automatically, ensure the agent does not have unrestricted access to production cluster credentials or sensitive storage buckets.
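As a rough illustration of point 3, the sketch below shows one way a per-partition writer could refuse unapproved endpoints and open a single connection per partition. It is a minimal sketch under stated assumptions, not code taken from the skill: `APPROVED_HOSTS`, `check_endpoint`, `write_partition`, and the `connect` factory are hypothetical names introduced here, and the host is a placeholder.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the endpoints your organization approves.
APPROVED_HOSTS = {"warehouse.internal.example"}


def check_endpoint(url):
    """Return the host of `url`, or raise if it is not on the allowlist."""
    host = urlparse(url).hostname
    if host not in APPROVED_HOSTS:
        raise ValueError(f"endpoint not approved: {host!r}")
    return host


def write_partition(rows, connect, url):
    """Send every row of one partition over a single connection.

    `connect` is a hypothetical factory returning an object with
    `send(row)` and `close()`; it stands in for a real DB/HTTP client.
    """
    check_endpoint(url)      # fail before any connection is opened
    conn = connect(url)
    sent = 0
    try:
        for row in rows:
            conn.send(row)
            sent += 1
    finally:
        conn.close()         # always release the connection, even on error
    return sent
```

In PySpark, a function like this could be handed to `df.foreachPartition(lambda rows: write_partition(rows, connect, url))`, which calls it once per partition with an iterator of rows and ignores the return value. For point 2, bounding what reaches the driver with `df.take(100)` or `df.limit(100).collect()` is a common safeguard in place of an unrestricted `collect()`.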

Capability Analysis

Type: OpenClaw Skill
Name: spark-engineer
Version: 0.1.0

The OpenClaw AgentSkills skill bundle for 'spark-engineer' is benign. The `SKILL.md` provides instructions for an AI agent to act as a Spark engineer, focusing on best practices and technical guidance. All referenced markdown files (`references/*.md`) contain documentation and code examples (PySpark/Scala) for Apache Spark operations, performance tuning, and structured streaming. There is no evidence of data exfiltration, malicious execution, persistence mechanisms, prompt injection attempts to subvert the agent's role, or obfuscation. Generic placeholders for S3 buckets, Kafka brokers, and database connection strings are used for illustrative purposes and do not indicate malicious intent.

Capability Assessment

✓ Purpose & Capability

Name/description match the content: all required files and instructions are Spark-focused (DataFrame API, RDDs, partitioning, tuning, streaming). No unrelated binaries, environment variables, or external services are declared as required.

✓ Instruction Scope

SKILL.md and reference files contain only Spark code examples, configuration recommendations, and monitoring guidance. They reference typical cluster endpoints and storage (S3, HDFS, Kafka) as examples for normal Spark usage, but do not instruct the agent to read local system secrets/configuration or to exfiltrate data to unexpected endpoints.

✓ Install Mechanism

No install spec or code files with executable install steps are present — this is instruction-only, so nothing is downloaded or written to disk by the skill itself.

✓ Credentials

The skill declares no required environment variables or credentials. Example snippets show connecting to typical data systems (S3, Kafka, HDFS) which would need credentials when actually run, but the skill itself does not request or embed secrets.

✓ Persistence & Privilege

Skill is not always-included, does not request persistent privileges, and is user-invocable only. There is no behavior that modifies other skills or global agent settings.

Version History

v0.1.0

Initial release of spark-engineer skill.

- Provides expert support for building and optimizing Apache Spark applications, ETL pipelines, and streaming analytics.
- Covers workflows for requirement analysis, pipeline design, implementation, optimization, and validation.
- Includes reference guides for DataFrame API, Spark SQL, RDD operations, partitioning, caching, performance tuning, and streaming.
- Lists critical best practices and anti-patterns for production Spark workloads.
- Supplies structured output templates including code, configurations, partitioning strategies, performance analysis, and monitoring advice.

Metadata

Slug spark-engineer

Version 0.1.0

License —

All-time Installs 4

Active Installs 4

Total Versions 1

Frequently Asked Questions

What is Spark Engineer?

Spark Engineer is an AI Agent Skill for Claude Code / OpenClaw. Use it when building Apache Spark applications or distributed data processing pipelines, or when optimizing big data workloads; invoke it for the DataFrame API, Spark SQL, RDD operations, performance tuning, and streaming analytics. It has 1907 downloads so far.

How do I install Spark Engineer?

Run "/install spark-engineer" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Spark Engineer free?

Yes, Spark Engineer is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Spark Engineer support?

Spark Engineer is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Spark Engineer?

It is built and maintained by Veera (@veeramanikandanr48); the current version is v0.1.0.
