Ch07 Data Transform Nodes
As data flows between nodes, it often needs reshaping, filtering, computing, and merging. n8n provides a complete suite of data transformation nodes — from simple field additions to complex JavaScript/Python logic. This chapter uses a real-world e-commerce order cleaning scenario to walk through all the key data processing nodes together.
Data Transform Nodes at a Glance
| Node | Primary Use | Typical Scenario |
|---|---|---|
| Set | Add/modify/delete fields | Tag data, modify field values |
| Edit Fields | Bulk field mapping and renaming | Rename API response fields to business fields |
| Code | JS/Python custom logic | Complex calculations, string processing, business rules |
| Filter | Conditionally filter items | Keep only orders with status "completed" |
| Split In Batches | Process items in batches | Write large datasets to DB; rate-limit API calls |
| Aggregate | Merge multiple items into one | Collect order IDs into one list, pack a list for a single notification |
| Merge | Combine two input streams | Join data from two sources |
| Sort | Sort items by field | Sort orders by amount descending |
| Limit | Cap output item count | Debug mode: process only N items |
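Several rows in this table correspond to familiar array operations. The block below is a minimal sketch in plain JavaScript, not n8n's implementation, of what Filter, Sort, Limit, and a key-based Merge do to a list of items. The sample orders, the `customers` stream, and every helper name are assumptions made for illustration, and the Merge shown is only one possible behaviour (a left-join style enrichment).

```javascript
// Illustrative only: plain-JavaScript equivalents of Filter, Sort, Limit, and Merge.
// The sample data and helper names are hypothetical, not part of n8n.
const orders = [
  { json: { id: 1, status: 'completed', amount: 120, customerId: 'c1' } },
  { json: { id: 2, status: 'pending', amount: 900, customerId: 'c2' } },
  { json: { id: 3, status: 'completed', amount: 450, customerId: 'c2' } },
  { json: { id: 4, status: 'completed', amount: 75, customerId: 'c3' } },
];
const customers = [
  { json: { customerId: 'c1', name: 'Ada' } },
  { json: { customerId: 'c2', name: 'Grace' } },
];

// Filter: keep only items that satisfy a condition.
const filterItems = (items, predicate) => items.filter(i => predicate(i.json));

// Sort: order items by a numeric field, descending (copy first so the input is not mutated).
const sortDesc = (items, field) =>
  [...items].sort((a, b) => b.json[field] - a.json[field]);

// Limit: cap the number of output items.
const limitItems = (items, n) => items.slice(0, n);

// Merge: enrich items from the first stream with the matching item from the second.
function mergeByKey(left, right, key) {
  const lookup = new Map(right.map(r => [r.json[key], r.json]));
  return left.map(l => ({ json: { ...l.json, ...(lookup.get(l.json[key]) || {}) } }));
}

const completed = filterItems(orders, o => o.status === 'completed');
const topTwo = limitItems(sortDesc(completed, 'amount'), 2);
const enriched = mergeByKey(topTwo, customers, 'customerId');
```

Chaining the helpers mirrors wiring the nodes one after another: each step receives the previous step's items and returns a new list.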
Set Node: Add/Modify/Delete Fields
Set is one of n8n's most frequently used data processing nodes. It adds new fields, modifies existing values, or removes unwanted fields from items.
Three operating modes:
- Keep All Fields (default): Preserve all original fields; add/modify specified ones
- Replace All Fields: Discard all original fields; keep only fields you explicitly add (useful for stripping noise)
- Delete Fields: Remove only the specified fields; keep everything else
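The three behaviours can be sketched in plain JavaScript as follows. This is an illustration of the effect on a single item, not n8n's code; `applySet`, its mode names, and the sample item are hypothetical.

```javascript
// Illustrative sketch of the three Set behaviours described above; not n8n's code.
// `applySet` and its mode names are hypothetical.
function applySet(item, mode, { set = {}, remove = [] } = {}) {
  const original = item.json;
  switch (mode) {
    case 'keepAll': // keep every original field, then add or overwrite
      return { json: { ...original, ...set } };
    case 'replaceAll': // drop the original fields, keep only the new ones
      return { json: { ...set } };
    case 'deleteFields': { // remove only the listed fields
      const kept = Object.fromEntries(
        Object.entries(original).filter(([key]) => !remove.includes(key))
      );
      return { json: kept };
    }
    default:
      throw new Error(`Unknown mode: ${mode}`);
  }
}

const item = { json: { id: 7, status: 'paid', internalNote: 'debug' } };
const kept = applySet(item, 'keepAll', { set: { source: 'shopify' } });
const replaced = applySet(item, 'replaceAll', { set: { orderId: '7' } });
const trimmed = applySet(item, 'deleteFields', { remove: ['internalNote'] });
```

In all three cases the input item is left untouched and a new object is returned, which keeps the effect of each mode easy to compare.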
Set vs Edit Fields: In current n8n releases these are one node, listed as Edit Fields (Set). Use it both to add computed fields or change a few values and to map or rename fields in bulk. The separate Rename Keys node remains available when renaming is all you need.
Code Node: JavaScript and Python Custom Logic
The Code node is n8n's most powerful data tool — write any JavaScript or Python you need, executed directly in the workflow without an external server.
JavaScript Mode (Recommended)
// Pattern 1: Transform each item
return items.map(item => {
  const order = item.json;
  const discountRate = order.tier === 'gold' ? 0.85 : 0.95;
  // toFixed returns a string, so convert back to a number
  const finalPrice = Number((order.originalPrice * discountRate).toFixed(2));
  const orderDate = new Date(order.createdAt);
  const formattedDate = orderDate.toLocaleDateString('en-US');
  return {
    json: {
      ...order,
      finalPrice,
      formattedDate,
      isHighValue: finalPrice > 1000,
    }
  };
});
// Pattern 2: Collapse all items into a single summary item
// (use instead of Pattern 1 in a separate Code node, not after it)
const summary = {
  orders: items.map(i => i.json),
  // Number(...) guards against amounts that arrive as strings
  totalAmount: items.reduce((sum, i) => sum + Number(i.json.amount || 0), 0),
  count: items.length,
};
return [{ json: summary }];
Python Mode
# Python mode: statistical analysis
# Assumes each item can be read like a dict with a 'json' key;
# item access can differ by n8n version and Python runtime.
orders = [item['json'] for item in _input.all()]
# Count orders per status
status_counts = {}
for order in orders:
    status = order.get('status', 'unknown')
    status_counts[status] = status_counts.get(status, 0) + 1
# Average order amount (0 when there are no orders)
amounts = [order.get('amount', 0) for order in orders]
avg_amount = sum(amounts) / len(amounts) if amounts else 0
return [{
    'json': {
        'statusBreakdown': status_counts,
        'avgAmount': round(avg_amount, 2),
        'totalOrders': len(orders),
    }
}]
Split In Batches: Handle Large Datasets
When processing thousands of records, Split In Batches divides items into chunks so you can process them incrementally without memory pressure or API rate limit violations.
Usage pattern: Split In Batches (shown as Loop Over Items in recent n8n versions) has two output ports. Loop emits the current batch to your processing nodes; connect the last of those nodes back to the Split In Batches input so that the next batch is released. Done fires once after all batches have been processed and is where post-processing belongs.
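The batch loop can be pictured with the following plain-JavaScript sketch. It is not how n8n implements the node; `chunk`, `processBatch`, the batch size of 3, and the sample items are all assumptions chosen for illustration.

```javascript
// Illustrative sketch of batch iteration; `chunk`, `processBatch`, and the
// batch size are hypothetical and not part of n8n.
function chunk(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Stand-in for whatever the Loop branch does (database write, API call, ...).
function processBatch(batch) {
  return batch.map(item => ({ json: { ...item.json, processed: true } }));
}

const items = Array.from({ length: 7 }, (_, i) => ({ json: { id: i + 1 } }));
const batches = chunk(items, 3);

// "Loop" branch: runs once per batch.
const results = [];
for (const batch of batches) {
  results.push(...processBatch(batch));
}

// "Done" branch: runs once, after the last batch.
const doneSummary = { batchCount: batches.length, itemCount: results.length };
```

Seven items with a batch size of 3 produce batches of 3, 3, and 1, so the final batch is usually smaller than the rest and downstream logic should not assume a fixed size.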
Aggregate Node: Merge Multiple Items Into One
Aggregate is the reverse of item-per-item processing: it collapses multiple items into a single item. Two modes:
- All Item Data (Into a Single List): Collect all items' json into a single array field
- Individual Fields: Collect the values of specific fields from all items into lists
Numeric summaries such as sum, count, or max/min are the job of the separate Summarize node.
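As a rough sketch of what collapsing items looks like, the plain-JavaScript helpers below pack whole items into one array and gather selected fields into per-field lists. The helper names `collectAll` and `collectFields`, the `data` output field, and the sample items are hypothetical; the node's real output depends on how it is configured.

```javascript
// Illustrative sketch of collapsing many items into one; helper names are hypothetical.
const items = [
  { json: { orderId: 'A1', amount: 40 } },
  { json: { orderId: 'A2', amount: 25.5 } },
  { json: { orderId: 'A3', amount: 100 } },
];

// Collect every item's json into a single array field.
function collectAll(inputItems, outputField = 'data') {
  return [{ json: { [outputField]: inputItems.map(i => i.json) } }];
}

// Collect the values of selected fields into one list per field.
function collectFields(inputItems, fields) {
  const json = {};
  for (const field of fields) {
    json[field] = inputItems.map(i => i.json[field]);
  }
  return [{ json }];
}

const packed = collectAll(items);
const lists = collectFields(items, ['orderId', 'amount']);

// Once the values sit in one list, summary numbers are a reduce away.
const amounts = lists[0].json.amount;
const total = amounts.reduce((sum, n) => sum + n, 0);
const max = Math.max(...amounts);
```

Both helpers return an array containing exactly one item, which is the defining property of this step: many items in, one item out.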
Real-World Example: Clean E-Commerce Order Data
Complete Code node that cleans raw Shopify API orders into an internal CRM format:
// Input: raw Shopify API order array
// Output: cleaned data matching internal CRM schema
const cleanOrders = items
  // 1. Filter: only fulfilled, paid orders
  .filter(item => {
    const o = item.json;
    return o.fulfillment_status === 'fulfilled' && o.financial_status === 'paid';
  })
  // 2. Transform: rename fields, convert types, compute derived fields
  .map(item => {
    const o = item.json;
    const customer = o.customer || {};
    const shippingAddr = o.shipping_address || {};
    const totalQuantity = (o.line_items || [])
      .reduce((sum, li) => sum + (li.quantity || 0), 0);
    const totalPrice = parseFloat(o.total_price || 0);
    return {
      json: {
        orderId: o.id?.toString(),
        orderNo: o.order_number?.toString(),
        orderDate: o.created_at ? o.created_at.substring(0, 10) : '',
        customerName: `${customer.first_name || ''} ${customer.last_name || ''}`.trim(),
        customerEmail: customer.email || '',
        totalAmount: totalPrice,
        currency: o.currency || 'USD',
        itemCount: totalQuantity,
        shippingCity: shippingAddr.city || '',
        isHighValue: totalPrice >= 500,
        source: 'shopify',
        syncedAt: new Date().toISOString(),
      }
    };
  })
  // 3. Deduplicate by orderId
  .filter((item, index, arr) =>
    arr.findIndex(a => a.json.orderId === item.json.orderId) === index
  );
return cleanOrders;
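The `findIndex` deduplication step rescans the array for every item, which is fine for small batches but grows quadratically with the number of orders. As an alternative sketch, a `Set` of already-seen IDs keeps the first occurrence of each `orderId` in a single pass. The helper name `dedupeByOrderId` and the sample data are hypothetical.

```javascript
// Illustrative alternative to the findIndex-based step;
// `dedupeByOrderId` is a hypothetical helper, not an n8n API.
function dedupeByOrderId(items) {
  const seen = new Set();
  return items.filter(item => {
    const id = item.json.orderId;
    if (seen.has(id)) return false; // later duplicate: drop it
    seen.add(id);
    return true; // first occurrence: keep it
  });
}

const sample = [
  { json: { orderId: '1001', totalAmount: 50 } },
  { json: { orderId: '1002', totalAmount: 75 } },
  { json: { orderId: '1001', totalAmount: 50 } },
];
const unique = dedupeByOrderId(sample);
```

Like the original step, this keeps the first item for each ID and preserves input order, so it could replace the final `.filter(...)` without changing the result.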
Data quality validation before database insertion:
const validItems = [];
const rejected = [];
const staticData = $getWorkflowStaticData('global');
for (const item of items) {
  const o = item.json;
  const errors = [];
  if (!o.orderId) errors.push('Missing orderId');
  if (!o.customerEmail?.includes('@')) errors.push('Invalid email');
  if (typeof o.totalAmount !== 'number' || o.totalAmount < 0) errors.push('Invalid amount');
  if (errors.length === 0) {
    validItems.push(item);
  } else {
    // Keep the original fields and attach the reasons for rejection
    rejected.push({ ...o, _errors: errors });
  }
}
// Store rejected items in static data for later review.
// Each run overwrites the previous list, and static data is only saved for
// production executions of an active workflow, not for manual test runs.
staticData.lastRejected = rejected;
return validItems;
Best practice: For data cleaning workflows, do all three steps (filter → transform → validate) inside a single Code node rather than spreading them across multiple Filter and Set nodes. A consolidated Code node keeps the logic in one place, which usually makes it easier to maintain and follow, and it avoids passing the items through a long chain of nodes.