Chapter 7

Data Transform Nodes

As data flows between nodes, it often needs to be reshaped, filtered, computed on, and merged. n8n provides a suite of data transformation nodes, ranging from simple field additions to custom JavaScript/Python logic. This chapter uses a real-world e-commerce order cleaning scenario to walk through the key data processing nodes.

Data Transform Nodes at a Glance

Node	Primary Use	Typical Scenario
Set	Add/modify/delete fields	Tag data, modify field values
Edit Fields	Bulk field mapping and renaming	Rename API response fields to business fields
Code	JS/Python custom logic	Complex calculations, string processing, business rules
Filter	Conditionally filter items	Keep only orders with status "completed"
Split In Batches	Process items in batches	Write large datasets to DB; rate-limit API calls
Aggregate	Merge multiple items into one	Collect field values into a list; pack items for a single notification
Merge	Combine two input streams	Join data from two sources
Sort	Sort items by field	Sort orders by amount descending
Limit	Cap output item count	Debug mode: process only N items
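
To make the table more concrete, the behavior of Filter, Sort, Limit, and Merge can be approximated in plain JavaScript. This is only a sketch that runs outside n8n: the sample orders, the customer records, and the `customerId` join key are made-up illustration data, and the real nodes are configured in the UI rather than written as code.

```javascript
// Hypothetical sample data in n8n's item shape: [{ json: {...} }, ...]
const orders = [
  { json: { id: 1, status: 'completed', amount: 120, customerId: 'c1' } },
  { json: { id: 2, status: 'pending',   amount: 900, customerId: 'c2' } },
  { json: { id: 3, status: 'completed', amount: 450, customerId: 'c2' } },
  { json: { id: 4, status: 'completed', amount: 75,  customerId: 'c3' } },
];
const customers = [
  { json: { customerId: 'c1', name: 'Ada' } },
  { json: { customerId: 'c2', name: 'Grace' } },
];

// Filter: keep only orders with status "completed"
const completed = orders.filter(item => item.json.status === 'completed');

// Sort: amount descending (copy first so the input array is not mutated)
const sorted = [...completed].sort((a, b) => b.json.amount - a.json.amount);

// Limit: keep only the first N items (N = 2 here)
const topTwo = sorted.slice(0, 2);

// Merge: enrich each order with its customer record, matched on customerId
const customerById = new Map(customers.map(c => [c.json.customerId, c.json]));
const merged = topTwo.map(item => ({
  json: {
    ...item.json,
    customerName: customerById.get(item.json.customerId)?.name ?? null,
  },
}));

console.log(merged.map(m => m.json));
```

Building a `Map` for the lookup keeps the join linear in the number of items, which matters once the two streams grow beyond a handful of records.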

Set Node: Add/Modify/Delete Fields

Set is one of n8n's most frequently used data processing nodes: it adds new fields, modifies existing values, or removes unwanted fields from items.

Three operating modes:

Keep All Fields (default): Preserve all original fields; add/modify specified ones
Replace All Fields: Discard all original fields; keep only fields you explicitly add (useful for stripping noise)
Delete Fields: Remove only the specified fields; keep everything else
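
The effect of the three modes on a single item can be sketched in plain JavaScript. The helper `applySet` and the mode labels `keepAll`, `replaceAll`, and `delete` are illustrative names chosen for this example, not n8n identifiers; in n8n itself the behavior is selected in the node's settings.

```javascript
// Hypothetical helper that imitates the three behaviors on one item's json.
function applySet(json, mode, { set = {}, remove = [] } = {}) {
  switch (mode) {
    case 'keepAll':
      // Preserve original fields; add or overwrite the specified ones
      return { ...json, ...set };
    case 'replaceAll':
      // Discard original fields; keep only what was explicitly set
      return { ...set };
    case 'delete': {
      // Remove only the listed fields; keep everything else
      const copy = { ...json };
      for (const key of remove) delete copy[key];
      return copy;
    }
    default:
      throw new Error(`Unknown mode: ${mode}`);
  }
}

const order = { id: 42, status: 'paid', internalNote: 'debug' };

const kept = applySet(order, 'keepAll', { set: { source: 'shopify' } });
const replaced = applySet(order, 'replaceAll', { set: { orderId: order.id } });
const trimmed = applySet(order, 'delete', { remove: ['internalNote'] });

console.log(kept, replaced, trimmed);
```

Each branch returns a new object, so the incoming item is left untouched and can still be inspected by later steps.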

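Set vs Edit Fields: Use Set to add computed fields or change a few values; use Edit Fields for bulk renaming or large-scale field mapping. Note that recent n8n releases present the Set node under the name Edit Fields (Set), so on a current installation both roles may be covered by one node. The dedicated Rename Keys node remains available for pure key renaming.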
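
Bulk renaming amounts to applying a lookup table to every key. A minimal sketch in plain JavaScript, assuming a hypothetical `fieldMap` from API field names to business field names and a hypothetical helper `renameFields`:

```javascript
// Hypothetical mapping: API field name -> business field name
const fieldMap = {
  order_number: 'orderNo',
  total_price: 'totalAmount',
  created_at: 'orderDate',
};

// Rename every key that appears in the mapping; leave other keys unchanged.
function renameFields(json, mapping) {
  const hasOwn = Object.prototype.hasOwnProperty;
  return Object.fromEntries(
    Object.entries(json).map(([key, value]) => [
      hasOwn.call(mapping, key) ? mapping[key] : key,
      value,
    ])
  );
}

const apiOrder = {
  order_number: 1001,
  total_price: '59.90',
  created_at: '2024-05-01T10:00:00Z',
  currency: 'USD',
};

const renamed = renameFields(apiOrder, fieldMap);
console.log(renamed);
```

Keeping the mapping in one object means a new field can be added by editing data rather than logic.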

Code Node: JavaScript and Python Custom Logic

The Code node is n8n's most powerful data tool — write any JavaScript or Python you need, executed directly in the workflow without an external server.

JavaScript Mode (Recommended)

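// Pattern 1: Transform each item
// (`items` is the array of incoming items when the node runs once for all items)
return items.map(item => {
  const order = item.json;

  // Gold-tier customers pay 85% of the original price; everyone else pays 95%
  const discountRate = order.tier === 'gold' ? 0.85 : 0.95;
  // toFixed() returns a string, so convert back to keep finalPrice numeric
  const finalPrice = Number((order.originalPrice * discountRate).toFixed(2));

  const orderDate = new Date(order.createdAt);
  const formattedDate = orderDate.toLocaleDateString('en-US');

  return {
    json: {
      ...order,
      finalPrice,
      formattedDate,
      isHighValue: finalPrice > 1000,
    }
  };
});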

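// Pattern 2: Collapse all items into a single summary item
// (an alternative to Pattern 1; use one pattern per Code node)
const summary = {
  orders: items.map(i => i.json),
  // Treat a missing or non-numeric amount as 0 so the total never becomes NaN
  totalAmount: items.reduce((sum, i) => sum + (Number(i.json.amount) || 0), 0),
  count: items.length,
};
return [{ json: summary }];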

Python Mode

# Python mode: statistical analysis
orders = [item['json'] for item in _input.all()]

status_counts = {}
for order in orders:
    status = order.get('status', 'unknown')
    status_counts[status] = status_counts.get(status, 0) + 1

amounts = [order.get('amount', 0) for order in orders]
avg_amount = sum(amounts) / len(amounts) if amounts else 0

return [{
    'json': {
        'statusBreakdown': status_counts,
        'avgAmount': round(avg_amount, 2),
        'totalOrders': len(orders),
    }
}]

Split In Batches: Handle Large Datasets

When processing thousands of records, Split In Batches divides items into chunks so you can process them incrementally without memory pressure or API rate limit violations.

Usage pattern: Split In Batches (labeled Loop Over Items in recent n8n versions) has two output ports. Loop emits the current batch to your processing nodes; connect the last of those nodes back to the Split In Batches input so that the next batch is released. Done fires once after all batches complete and is the place for post-processing.
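
The batch loop can be pictured in plain JavaScript. This is a minimal sketch of the idea, not how n8n implements the node: `toBatches`, `processBatch`, and `run` are hypothetical names, and `processBatch` stands in for whatever database write or API call the workflow performs.

```javascript
// Split an array into chunks of at most `size` items.
function toBatches(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical stand-in for the processing node (e.g. a DB write or API call).
async function processBatch(batch) {
  return batch.map(item => ({ json: { ...item.json, processed: true } }));
}

async function run(items, batchSize) {
  const results = [];
  // Corresponds to the Loop output: one batch at a time, in order
  for (const batch of toBatches(items, batchSize)) {
    results.push(...(await processBatch(batch)));
  }
  // Corresponds to the Done output: everything has been processed
  return results;
}

const sample = Array.from({ length: 7 }, (_, i) => ({ json: { id: i + 1 } }));
run(sample, 3).then(out => console.log(`${out.length} items processed`));
```

Awaiting each batch before starting the next is what keeps the number of in-flight requests bounded; adding a delay inside the loop would be one way to respect a stricter rate limit.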

Aggregate Node: Merge Multiple Items Into One

Aggregate is the reverse of item-per-item processing: it collapses multiple items into a single item. Two modes:

Append All Items to Array: Collect all items' json into a single array field
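Aggregate Individual Fields: Collect the values of specific fields across all items into arrays on a single item. For sum, count, or max/min calculations, n8n's Summarize node or a Code node is the usual choice.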
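
The two result shapes can be approximated in plain JavaScript. This is a sketch under assumptions: the sample items are made up, the output field names (`data`, `sum`, `count`, `max`, `min`) are chosen for illustration, and the summary figures are computed by hand from the collected `amount` values.

```javascript
// Hypothetical sample items in n8n's item shape
const orderItems = [
  { json: { id: 1, amount: 120 } },
  { json: { id: 2, amount: 450 } },
  { json: { id: 3, amount: 75 } },
];

// All items into one array: every item's json ends up in a single list field
const appended = [{ json: { data: orderItems.map(item => item.json) } }];

// Per-field: collect one field across items, then derive summary figures
const amounts = orderItems.map(item => item.json.amount);
const perField = [{
  json: {
    amount: amounts,
    sum: amounts.reduce((total, value) => total + value, 0),
    count: amounts.length,
    max: Math.max(...amounts),
    min: Math.min(...amounts),
  },
}];

console.log(appended[0].json, perField[0].json);
```

Either way the output is a one-element item array, which is what lets a downstream node send a single notification instead of one per order.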

Real-World Example: Clean E-Commerce Order Data

The following Code node cleans raw Shopify API orders into an internal CRM format:

// Input: raw Shopify API order array
// Output: cleaned data matching internal CRM schema

const cleanOrders = items
  // 1. Filter: only fulfilled, paid orders
  .filter(item => {
    const o = item.json;
    return o.fulfillment_status === 'fulfilled' && o.financial_status === 'paid';
  })
  // 2. Transform: rename fields, convert types, compute derived fields
  .map(item => {
    const o = item.json;
    const customer = o.customer || {};
    const shippingAddr = o.shipping_address || {};

    const totalQuantity = (o.line_items || [])
      .reduce((sum, li) => sum + (li.quantity || 0), 0);

    const totalPrice = parseFloat(o.total_price || 0);

    return {
      json: {
        orderId:        o.id?.toString(),
        orderNo:        o.order_number?.toString(),
        orderDate:      o.created_at ? o.created_at.substring(0, 10) : '',
        customerName:   `${customer.first_name || ''} ${customer.last_name || ''}`.trim(),
        customerEmail:  customer.email || '',
        totalAmount:    totalPrice,
        currency:       o.currency || 'USD',
        itemCount:      totalQuantity,
        shippingCity:   shippingAddr.city || '',
        isHighValue:    totalPrice >= 500,
        source:         'shopify',
        syncedAt:       new Date().toISOString(),
      }
    };
  })
  // 3. Deduplicate by orderId
  .filter((item, index, arr) =>
    arr.findIndex(a => a.json.orderId === item.json.orderId) === index
  );

return cleanOrders;

Data quality validation before database insertion:

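const validItems = [];
const rejected = [];
// Workflow static data persists between executions only when the workflow is
// active and started by a trigger or webhook, not during manual test runs
const staticData = $getWorkflowStaticData('global');

for (const item of items) {
  const o = item.json;
  const errors = [];

  if (!o.orderId) errors.push('Missing orderId');
  // Minimal sanity check, not full email validation
  if (typeof o.customerEmail !== 'string' || !o.customerEmail.includes('@')) errors.push('Invalid email');
  // Number.isFinite also rejects NaN, which typeof reports as 'number'
  if (!Number.isFinite(o.totalAmount) || o.totalAmount < 0) errors.push('Invalid amount');

  if (errors.length === 0) {
    validItems.push(item);
  } else {
    rejected.push({ ...o, _errors: errors });
  }
}

// Store rejected items in static data for later review
staticData.lastRejected = rejected;

return validItems;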

Best practice: For data cleaning workflows, consider doing all three steps (filter, transform, validate) inside a single Code node rather than spreading them across multiple Filter and Set nodes. One consolidated Code node keeps the logic in one place, which usually makes it easier to follow and maintain, and it avoids passing items through several intermediate nodes.
