Case Study: Enterprise Knowledge Assistant — Full Delivery Lifecycle
Chapter 21: Project — Enterprise Internal Knowledge Assistant End-to-End
Using a real case from an 800-person manufacturing company, this chapter walks through the complete journey from requirements research through architecture design, knowledge base construction, RAG tuning, integration, and launch.
Chapter Overview
The "enterprise internal knowledge assistant" is one of Dify's most common deployment scenarios. But most teams make the same mistake: dump all documents into a knowledge base, tweak a few parameters, go live, then discover users are unsatisfied with the answers.
A genuinely successful enterprise knowledge assistant requires seven stages: requirements deep-dive → document governance → tiered knowledge base design → RAG tuning → integration development → user acceptance → continuous iteration.
Case background: A precision manufacturing company (800 employees) with the following knowledge assets:
- HR documents: employee handbook, attendance policy, benefits (approx. 500 pages PDF)
- Technical documents: product specifications, process manuals (approx. 3,000 pages, some with drawings)
- Compliance documents: ISO 9001/14001 certification files, customer audit questionnaires
- IT documents: internal system user guides (approx. 200 pages)
Target: Within 3 months of launch, increase the rate of employees self-resolving issues via AI from 20% to 65%, and reduce repetitive HR and IT support tickets by 40%.
Level 1: Core Concepts (1–3 Years Experience)
Requirements Research: Finding the Real Pain Points
Before writing any code, spend 2 weeks on requirements research.
Method 1: Shadow work
Follow HR staff and IT engineers for 2 days, recording:
- Types of queries received each day
- Time spent answering each query
- Which questions have standard answers (suitable for AI)
- Which require judgment calls (AI-assisted human handling)
Actual research findings from this company:
| Query Type | Daily Volume | Avg. Handling Time | AI-Solvable Rate |
|---|---|---|---|
| Leave application process | 23 | 5 min | 95% |
| Expense reimbursement rules | 18 | 8 min | 90% |
| Social insurance questions | 12 | 15 min | 70% |
| System operation questions | 31 | 12 min | 80% |
| Product specification queries | 15 | 20 min | 85% |
| Supplier certification requirements | 8 | 30 min | 60% |
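One way to turn this table into a priority ranking is to estimate the staff minutes each query type could free up per day. The sketch below assumes the metric "daily volume × average handling time × AI-solvable rate", which is an illustrative choice rather than something the research itself prescribes; the function names are hypothetical.

```python
# Rank query types by estimated staff minutes saved per day.
# Metric (an assumption for illustration): volume * minutes * solvable rate.
QUERY_STATS = [
    # (query type, daily volume, avg handling minutes, AI-solvable rate)
    ("Leave application process", 23, 5, 0.95),
    ("Expense reimbursement rules", 18, 8, 0.90),
    ("Social insurance questions", 12, 15, 0.70),
    ("System operation questions", 31, 12, 0.80),
    ("Product specification queries", 15, 20, 0.85),
    ("Supplier certification requirements", 8, 30, 0.60),
]


def minutes_saved(volume: int, minutes: float, solvable_rate: float) -> float:
    """Estimated staff minutes per day that AI answers could absorb."""
    return volume * minutes * solvable_rate


def rank_by_savings(stats: list) -> list:
    """Return (query type, minutes saved) pairs, largest saving first."""
    scored = [(name, minutes_saved(v, m, r)) for name, v, m, r in stats]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    ranking = rank_by_savings(QUERY_STATS)
    for name, saved in ranking:
        print(f"{name:40s} {saved:7.1f} min/day")
    total = sum(saved for _, saved in ranking)
    print(f"Total: {total / 60:.1f} staff hours/day")
```

Under this metric, system operation and product specification queries rank highest, even though leave questions have the highest AI-solvable rate.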
Document Governance: Garbage In, Garbage Out
Knowledge base quality directly determines AI answer quality.
Problem 1: Version chaos
Companies often have multiple versions of the same document (2019 version, 2021 version, latest). When these are mixed together in one knowledge base, the AI may answer from an outdated version.
Solution: Establish document version control:
Naming convention: [doc_type]_[version]_[effective_date].pdf
Example: employee_handbook_v3.2_20240101.pdf
Upload rules:
- When a new version is uploaded, retire older versions of the same type
- Add metadata to each Dify document: version, effective_date, department
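The naming convention and retirement rule above can be checked mechanically before upload. The following is a minimal sketch, assuming versions are written as "v" plus dotted integers and dates as YYYYMMDD, as in the example filename; `parse_filename` and `split_current_and_retired` are hypothetical helper names, not part of Dify.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Pattern for [doc_type]_[version]_[effective_date].pdf, assuming the version
# is "v" followed by dotted integers and the date is YYYYMMDD.
FILENAME_RE = re.compile(
    r"^(?P<doc_type>[a-z0-9_]+)_v(?P<version>\d+(?:\.\d+)*)_(?P<date>\d{8})\.pdf$"
)


class DocVersion(NamedTuple):
    doc_type: str
    version: tuple          # e.g. (3, 2), so that (3, 10) sorts after (3, 2)
    effective_date: date
    filename: str


def parse_filename(filename: str) -> Optional[DocVersion]:
    """Return the parsed metadata, or None if the name breaks the convention."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    version = tuple(int(part) for part in match.group("version").split("."))
    effective = datetime.strptime(match.group("date"), "%Y%m%d").date()
    return DocVersion(match.group("doc_type"), version, effective, filename)


def split_current_and_retired(filenames: list) -> tuple:
    """Keep the highest version per doc_type; everything older is retired."""
    latest = {}
    retired = []
    for name in filenames:
        doc = parse_filename(name)
        if doc is None:
            continue  # non-conforming names are left for manual review
        current = latest.get(doc.doc_type)
        if current is None or doc.version > current.version:
            if current is not None:
                retired.append(current.filename)
            latest[doc.doc_type] = doc
        else:
            retired.append(doc.filename)
    return {d.filename for d in latest.values()}, sorted(retired)
```

The parsed fields map directly onto the version, effective_date, and department-style metadata mentioned in the upload rules; comparing versions as integer tuples avoids treating v3.10 as older than v3.2.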
Problem 2: Scanned PDFs with no extractable text
Manufacturing companies often have many scanned PDFs. These files contain page images rather than a text layer, so Dify's default PDF parser cannot process them.
Solution: Pre-processing pipeline:
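# Requires the Tesseract and Poppler binaries plus the pytesseract and
# pdf2image packages.
import pytesseract
from pdf2image import convert_from_path
from pathlib import Path

def ocr_pdf(input_path: str, output_path: str, lang: str = 'eng+chi_sim') -> str:
    """OCR every page of a scanned PDF and write the text to output_path."""
    # Render each page to an image before running OCR on it.
    images = convert_from_path(input_path, dpi=300)
    text_pages = []
    for i, image in enumerate(images):
        text = pytesseract.image_to_string(image, lang=lang)
        # Keep page markers so an answer can be traced back to its page.
        text_pages.append(f"=== Page {i+1} ===\n{text}")
    full_text = '\n\n'.join(text_pages)
    with open(output_path, 'w', encoding='utf-8') as f:
        f.write(full_text)
    return full_text

def batch_process(input_dir: str, output_dir: str):
    """OCR all PDFs under input_dir, skipping files that already have output."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for pdf in Path(input_dir).glob('**/*.pdf'):
        # Output names are flat: PDFs with the same name in different
        # subfolders map to the same .txt file.
        out_file = Path(output_dir) / pdf.with_suffix('.txt').name
        if not out_file.exists():
            print(f"Processing: {pdf.name}")
            ocr_pdf(str(pdf), str(out_file))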
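OCR is slow, so it can help to route only the PDFs that actually lack a text layer through the pipeline above. The sketch below is one possible heuristic, assuming the per-page text has already been extracted with a PDF library of your choice (for example pypdf); the function name and both thresholds are hypothetical and should be tuned on real files.

```python
def needs_ocr(page_texts: list, min_chars_per_page: int = 50,
              max_empty_ratio: float = 0.5) -> bool:
    """Return True when most pages have too little extractable text.

    page_texts holds the text extracted from each page by a regular
    (non-OCR) PDF parser. Thresholds are illustrative assumptions.
    """
    if not page_texts:
        return True  # nothing extractable at all: treat as scanned
    empty_pages = sum(
        1 for text in page_texts if len(text.strip()) < min_chars_per_page
    )
    return empty_pages / len(page_texts) > max_empty_ratio
```

A file is sent to OCR only when more than half of its pages are effectively empty, which keeps born-digital PDFs with a few image-only pages on the faster default path.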
Creating the Knowledge Base in Dify
Step 1: Layered knowledge base structure
Do not put all documents in one knowledge base. Separate by domain:
Knowledge bases:
├── kb-hr — HR policies and procedures
├── kb-it — IT system operation
├── kb-product — Product technical documentation
└── kb-quality — Quality and compliance
Step 2: Chunking configuration
In Dify knowledge base settings:
Segmentation rule: Automatic
Max segment length: 500 tokens
Segment overlap: 50 tokens
Embedding model: text-embedding-ada-002
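To make the interaction between segment length and overlap concrete, here is a minimal sliding-window sketch. It assumes a pre-tokenized input list and is only an illustration of the two settings; Dify's own splitter may segment differently (for example on separators), and `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(tokens: list, max_len: int = 500, overlap: int = 50) -> list:
    """Split tokens into windows of max_len that share `overlap` tokens."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be >= 0 and smaller than max_len")
    step = max_len - overlap  # each window starts this far after the last
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


if __name__ == "__main__":
    tokens = [str(i) for i in range(1200)]
    print([len(chunk) for chunk in chunk_tokens(tokens)])  # [500, 500, 300]
```

With these settings a 1,200-token document yields three segments starting at tokens 0, 450, and 900, so a sentence that straddles a boundary still appears whole in at least one segment as long as it is shorter than the overlap.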
Step 3: Retrieval configuration
Retrieval mode: Hybrid (vector + keyword)
Vector weight: 0.7
Keyword weight: 0.3
Top K: 5
Similarity threshold: 0.5
Re-ranking: Enabled (BGE Reranker)
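The weights, Top K, and threshold above interact in a way that is easy to see in code. The sketch below is a simplified model, assuming both scores are already normalized to the 0 to 1 range and that the threshold is applied to the blended score; how Dify normalizes and applies these values internally is not covered here, and `hybrid_retrieve` is a hypothetical name.

```python
def hybrid_retrieve(vector_scores: dict, keyword_scores: dict,
                    vector_weight: float = 0.7, keyword_weight: float = 0.3,
                    top_k: int = 5, threshold: float = 0.5) -> list:
    """Blend two score maps (doc_id -> score in [0, 1]) and keep the best."""
    doc_ids = set(vector_scores) | set(keyword_scores)
    combined = {
        doc_id: vector_weight * vector_scores.get(doc_id, 0.0)
        + keyword_weight * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Drop weak matches first, then sort by score (doc_id breaks ties).
    kept = [(d, s) for d, s in combined.items() if s >= threshold]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept[:top_k]


if __name__ == "__main__":
    vector = {"a": 0.9, "b": 0.6, "c": 0.2}
    keyword = {"a": 0.5, "c": 1.0, "d": 0.8}
    print([d for d, _ in hybrid_retrieve(vector, keyword)])                 # ['a']
    print([d for d, _ in hybrid_retrieve(vector, keyword, threshold=0.4)])  # ['a', 'c', 'b']
```

In this toy data, document "c" is a perfect keyword match but a poor semantic match; it only survives when the threshold is lowered, which shows why the threshold and the weights should be tuned together.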
Level 2: Mechanism Deep Dive (3–5 Years Experience)
Architecture: Multi-App with Routing Layer
The company adopted a "single unified entry + multiple specialized assistants" architecture:
    User Entry (Enterprise WeChat Bot / Internal Portal)
                      |
                      v
         Routing Layer (Dify Workflow)
    +------------------------------------+
    | Intent classification:             |
    |   - HR queries      → HR bot       |
    |   - IT queries      → IT bot       |
    |   - Product queries → Product      |
    |   - Quality queries → Quality      |
    |   - Other           → General bot  |
    +------------------------------------+
                      |
       +-----------+--+---------+--------------+
       |           |            |              |
       v           v            v              v
     HR Bot      IT Bot    Product Bot    Quality Bot
     (RAG)       (RAG)        (RAG)          (RAG)
       |           |            |              |
     kb-hr       kb-it     kb-product     kb-quality
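The routing decision in the diagram can be prototyped outside Dify before building the workflow. The sketch below uses a keyword lookup as a stand-in for the intent classifier; in the architecture above that step runs inside a Dify Workflow, so the keyword lists and function names here are purely hypothetical.

```python
import re

# Intent -> knowledge base, mirroring the diagram. "general" has no
# dedicated knowledge base.
ROUTES = {"hr": "kb-hr", "it": "kb-it", "product": "kb-product", "quality": "kb-quality"}

# Illustrative keywords only; a production router would use an LLM classifier.
KEYWORDS = {
    "hr": {"leave", "reimbursement", "attendance", "benefits", "insurance"},
    "it": {"password", "vpn", "login", "system", "account"},
    "product": {"specification", "tolerance", "material", "drawing"},
    "quality": {"iso", "audit", "certification", "supplier"},
}


def classify_intent(query: str) -> str:
    """Pick the intent whose keywords appear most often as whole words."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    hits = {intent: len(words & keywords) for intent, keywords in KEYWORDS.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "general"


def route(query: str) -> tuple:
    """Return (intent, knowledge base name or None for the general bot)."""
    intent = classify_intent(query)
    return intent, ROUTES.get(intent)
```

Matching on whole words rather than substrings avoids false positives such as "iso" inside "supervisor". A table like this is also useful later as a cheap regression test for the real classifier.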
RAG Parameter Tuning
| Parameter | Default | Recommended | Reason |
|---|---|---|---|
| Chunk Size | 500 tokens | 400 tokens | Dense information in policy docs; smaller chunks are more precise |
| Chunk Overlap | 50 tokens | 80 tokens | Policy docs have cross-paragraph dependencies |
| Top K | 3 | 5 | HR questions may involve multiple regulations |
| Similarity threshold | 0.5 | 0.6 | Higher precision, less noise |
| Reranker | Off | On | Significantly improves ranking quality |
Hybrid Search Internal Mechanism
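Hybrid search combines keyword matching such as BM25 with vector search (semantic similarity) and then merges the two result lists. Reciprocal Rank Fusion (RRF) is a common merging method and is shown here to illustrate the idea. Note that the weighted setting in Step 3 (0.7 vector, 0.3 keyword) blends scores rather than ranks, so verify which fusion method your Dify version actually applies: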
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse several ranked lists of doc IDs into one list, best first."""
    scores = {}
    for ranking in rankings:
        # rank is 0-based, so the top document contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking):
            if doc_id not in scores:
                scores[doc_id] = 0
            scores[doc_id] += 1 / (k + rank + 1)
    return sorted(scores.keys(), key=lambda x: scores[x], reverse=True)
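A small worked example makes the behavior of this fusion easier to follow. The block repeats the function so it runs on its own; the document IDs are made up for illustration.

```python
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse ranked lists of doc IDs; each list adds 1 / (k + rank + 1)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0) + 1 / (k + rank + 1)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


if __name__ == "__main__":
    keyword_ranking = ["a", "b", "c"]  # e.g. BM25 order
    vector_ranking = ["b", "c", "d"]   # e.g. embedding similarity order
    # Scores: a = 1/61, b = 1/62 + 1/61, c = 1/63 + 1/62, d = 1/63
    print(reciprocal_rank_fusion([keyword_ranking, vector_ranking]))
    # ['b', 'c', 'a', 'd']
```

Documents that appear in both lists ("b" and "c") outrank "a", even though "a" was first in the keyword list. This is the main property of RRF: agreement between retrievers matters more than a single high rank.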
System Prompt Engineering
HR assistant System Prompt after 6 iterations:
You are the HR AI assistant for [Company Name], responsible for answering
employee questions about company policies, benefits, and procedures.
## Your principles:
1. **Answer only from the knowledge base**: If no relevant information exists,
say clearly "I cannot find information about this in the knowledge base.
Please contact HR directly."
2. **Cite document sources**: Note which document the information comes from,
e.g., "According to the Employee Handbook, Chapter 3..."
3. **Give complete workflows**: For process questions (leave, reimbursement),
list every step completely.
4. **Distinguish rule absoluteness**: Clearly differentiate "must" (regulatory
requirement) from "recommended" (company encouragement).
5. **Transfer sensitive issues to humans**: For personal salary data,
performance disputes, or labor arbitration, guide employees to HR staff.
6. **Professional and friendly tone**: Use polite, professional language.
## What you must NOT do:
- Guess or infer policies (only cite existing documents)
- Promise exceptions or special handling
- Comment on the reasonableness of company policies
- Reveal other employees' personal information
Enterprise WeChat Integration
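import requests
from flask import Flask, request

app = Flask(__name__)

DIFY_API_URL = "https://dify.yourcompany.com/v1"
DIFY_APP_TOKEN = "your-dify-app-token"
FALLBACK_ANSWER = 'Sorry, I cannot answer this at the moment.'

conversation_sessions = {}  # Use Redis in production

# parse_wechat_message() and reply_text() are placeholders for your own
# Enterprise WeChat helpers (message parsing and reply formatting).

@app.route('/wechat/callback', methods=['POST'])
def wechat_callback():
    data = parse_wechat_message(request.data)
    if data['MsgType'] != 'text':
        return reply_text(data['FromUserName'], "Text messages only, please.")

    user_id = data['FromUserName']
    user_query = data['Content']
    conversation_id = conversation_sessions.get(user_id)

    try:
        response = requests.post(
            f'{DIFY_API_URL}/chat-messages',
            headers={'Authorization': f'Bearer {DIFY_APP_TOKEN}'},
            json={
                'inputs': {},
                'query': user_query,
                'response_mode': 'blocking',
                'conversation_id': conversation_id or '',
                # Pass the real sender ID so conversations and statistics
                # are attributed per user, not to one shared identity.
                'user': user_id
            },
            timeout=30
        )
        response.raise_for_status()
        result = response.json()
    except (requests.RequestException, ValueError):
        # Network error, non-2xx status, or a non-JSON response body.
        return reply_text(user_id, FALLBACK_ANSWER)

    # Remember the conversation only when Dify returned an ID.
    if result.get('conversation_id'):
        conversation_sessions[user_id] = result['conversation_id']
    answer = result.get('answer', FALLBACK_ANSWER)
    return reply_text(user_id, answer)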
Level 3: Source Code and Architecture (5+ Years)
RAG Quality Evaluation Framework
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_recall, context_precision
from datasets import Dataset

def evaluate_rag_quality(test_cases: list) -> dict:
    """
    Evaluate RAG system quality using RAGAS framework.

    Column names below follow older RAGAS releases; newer versions may
    expect different field names.

    test_cases format:
    [
        {
            "question": "How many days of annual leave?",
            "contexts": ["According to the handbook, employees with 1-10 years..."],
            "answer": "Employees with 1-10 years receive 5 days of annual leave...",
            "ground_truth": "1-10 years: 5 days; 10+ years: 10 days"
        }
    ]
    """
    dataset = Dataset.from_list(test_cases)
    return evaluate(
        dataset,
        metrics=[faithfulness, answer_relevancy, context_recall, context_precision]
    )

# call_dify_with_context() and send_alert() are placeholders: the first
# should return the answer plus the retrieved chunk texts for a question,
# the second should notify the team.
def weekly_quality_check(test_set: list):
    # Refresh each golden-set case with the live system's current output.
    for case in test_set:
        response = call_dify_with_context(case['question'])
        case['contexts'] = response['retrieval_documents']  # list of strings
        case['answer'] = response['answer']
    metrics = evaluate_rag_quality(test_set)
    if metrics['faithfulness'] < 0.75:
        send_alert(f"RAG faithfulness dropped to {metrics['faithfulness']:.2f}")
    return metrics
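The RAGAS metrics above rely on LLM calls, so a cheaper retrieval-only check can be useful between weekly runs. The sketch below computes a simple hit rate: the share of golden-set questions for which at least one expected document shows up in the top k results. It is a complement to RAGAS, not a replacement; the `expected_docs` field and the `retrieve` callable are assumptions introduced for this example.

```python
from typing import Callable


def retrieval_hit_rate(golden_set: list, retrieve: Callable[[str], list],
                       k: int = 5) -> float:
    """Fraction of questions with at least one expected doc in the top k.

    golden_set items look like:
        {"question": "...", "expected_docs": ["employee_handbook", ...]}
    retrieve(question) must return document IDs, best match first.
    """
    if not golden_set:
        return 0.0
    hits = 0
    for case in golden_set:
        retrieved = retrieve(case["question"])[:k]
        if any(doc_id in case["expected_docs"] for doc_id in retrieved):
            hits += 1
    return hits / len(golden_set)


if __name__ == "__main__":
    # Fake retriever standing in for a real knowledge base query.
    canned = {
        "How many days of annual leave?": ["employee_handbook", "attendance_policy"],
        "How do I reset my VPN password?": ["benefits_guide"],
    }
    golden = [
        {"question": "How many days of annual leave?", "expected_docs": ["employee_handbook"]},
        {"question": "How do I reset my VPN password?", "expected_docs": ["it_vpn_guide"]},
    ]
    print(retrieval_hit_rate(golden, lambda q: canned.get(q, [])))  # 0.5
```

A falling hit rate points at retrieval or document problems, while a stable hit rate with falling faithfulness points at the prompt or the model.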
Automated Document Sync
import hashlib
import json
import os
from pathlib import Path
import requests

class DifyKnowledgeBaseSync:
    """Keep a Dify knowledge base in step with a local document folder."""

    def __init__(self, dataset_id: str, api_key: str, api_url: str):
        self.dataset_id = dataset_id
        self.api_key = api_key
        self.api_url = api_url
        # Local record of file path -> content hash and Dify document ID.
        self.state_file = f".sync_state_{dataset_id}.json"
        self.state = self._load_state()

    def _load_state(self) -> dict:
        if os.path.exists(self.state_file):
            with open(self.state_file) as f:
                return json.load(f)
        return {}

    def _save_state(self):
        with open(self.state_file, 'w') as f:
            json.dump(self.state, f)

    def _file_hash(self, file_path: str) -> str:
        # MD5 is used only for change detection, not for security.
        with open(file_path, 'rb') as f:
            return hashlib.md5(f.read()).hexdigest()

    # _upload_document() and _delete_document() are not shown; implement
    # them with requests against your Dify knowledge base API.

    def sync_directory(self, docs_dir: str):
        docs_path = Path(docs_dir)
        for doc_file in docs_path.glob('**/*'):
            if not doc_file.is_file():
                continue
            # Compare in lower case so ".PDF" is handled like ".pdf".
            if doc_file.suffix.lower() not in ['.pdf', '.docx', '.txt', '.md']:
                continue
            file_path = str(doc_file)
            current_hash = self._file_hash(file_path)
            if file_path not in self.state:
                print(f"New document: {doc_file.name}")
                doc_id = self._upload_document(file_path)
                self.state[file_path] = {'hash': current_hash, 'dify_doc_id': doc_id}
            elif self.state[file_path]['hash'] != current_hash:
                # Changed file: replace the old Dify document with the new one.
                print(f"Updated document: {doc_file.name}")
                self._delete_document(self.state[file_path]['dify_doc_id'])
                doc_id = self._upload_document(file_path)
                self.state[file_path] = {'hash': current_hash, 'dify_doc_id': doc_id}
        self._save_state()
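The sync loop above handles new and changed files, but a file removed from disk would stay in the knowledge base. One way to cover that case is to compute the full change plan first and execute it afterwards. The sketch below is a pure function under that assumption; `plan_sync` is a hypothetical name and it reuses the state layout from the class above.

```python
def plan_sync(state: dict, current_hashes: dict) -> dict:
    """Compare the saved state with the files currently on disk.

    state:          {path: {"hash": str, "dify_doc_id": str}} from the last run
    current_hashes: {path: hash} for the files found now
    Returns sorted path lists for each action.
    """
    upload = sorted(path for path in current_hashes if path not in state)
    replace = sorted(
        path for path, file_hash in current_hashes.items()
        if path in state and state[path]["hash"] != file_hash
    )
    # Files that disappeared from disk should be retired from the index too.
    delete = sorted(path for path in state if path not in current_hashes)
    return {"upload": upload, "replace": replace, "delete": delete}


if __name__ == "__main__":
    state = {
        "hr/handbook.pdf": {"hash": "aaa", "dify_doc_id": "doc-1"},
        "hr/old_policy.pdf": {"hash": "bbb", "dify_doc_id": "doc-2"},
    }
    on_disk = {"hr/handbook.pdf": "ccc", "it/vpn_guide.md": "ddd"}
    print(plan_sync(state, on_disk))
```

Separating planning from execution also makes a dry run possible: the plan can be printed and reviewed before any document is deleted from the knowledge base.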
Level 4: Production Traps and Decisions (Expert Perspective)
Pre-Launch Acceptance Checklist
Before opening to all employees:
Functional tests (minimum 50 test cases):
✅ Accuracy rate on standard questions ≥ 90%
✅ Refuses to answer questions outside knowledge base (no hallucination)
✅ Retains multi-turn conversation context correctly
✅ Proactively requests clarification on ambiguous questions
✅ Correctly escalates sensitive questions to humans
✅ Handles special characters and emoji without crashing
✅ Correctly handles overly long inputs (> 2,000 characters)
Performance test targets (800 employees, 50 concurrent users):
- P50 response time < 3s
- P95 response time < 8s
- Error rate < 1%
- Throughput ≥ 20 requests/second
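These targets are easy to check automatically from load-test output. The sketch below assumes the load tool yields a list of successful response times in seconds, a count of failed requests, and the test duration; it uses the nearest-rank percentile definition, and the function names are hypothetical.

```python
import math


def percentile(values: list, pct: int) -> float:
    """Nearest-rank percentile: smallest value with pct% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_targets(latencies_s: list, errors: int, duration_s: float) -> dict:
    """Compare a load-test run with the acceptance targets listed above."""
    total = len(latencies_s) + errors
    results = {
        "p50_s": percentile(latencies_s, 50),
        "p95_s": percentile(latencies_s, 95),
        "error_rate": errors / total,
        "throughput_rps": total / duration_s,
    }
    results["passed"] = (
        results["p50_s"] < 3
        and results["p95_s"] < 8
        and results["error_rate"] < 0.01
        and results["throughput_rps"] >= 20
    )
    return results


if __name__ == "__main__":
    # 100 requests in 5 seconds: 90 fast responses, 10 slow ones, no errors.
    print(check_targets([1.0] * 90 + [5.0] * 10, errors=0, duration_s=5))
```

As a rough consistency check on the targets themselves, Little's Law gives a mean response time of 50 / 20 = 2.5 s when all 50 concurrent users always have a request in flight at 20 requests/second, which sits just under the P50 target.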
3-Month Post-Launch Retrospective
Month 1: Users didn't trust AI answers
Symptom: Users received AI answers but still confirmed with HR.
Root cause: No source citations; users couldn't assess reliability.
Fix: Modified System Prompt to always append sources:
---
Source: Employee Handbook, Chapter X (v3.2, effective January 2024)
Questions? Contact HR: [HR email] | Ext. 1234
Result: User satisfaction improved from 62% to 81%.
Month 2: Technical document retrieval inaccurate
Symptom: Product spec tables (material parameters, dimensional tolerances) couldn't be found accurately.
Root cause: PDF table structure was lost when converted to plain text.
Fix: Use camelot for table extraction:
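import camelot

def extract_tables_from_pdf(pdf_path: str) -> str:
    """Extract all tables from a text-based PDF and return them as Markdown."""
    # 'stream' targets tables laid out with whitespace; use 'lattice' for
    # tables drawn with ruling lines.
    tables = camelot.read_pdf(pdf_path, pages='all', flavor='stream')
    markdown_tables = []
    for i, table in enumerate(tables):
        # DataFrame.to_markdown() needs the 'tabulate' package installed.
        md_table = table.df.to_markdown(index=False)
        markdown_tables.append(f"## Table {i+1}\n{md_table}")
    return '\n\n'.join(markdown_tables)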
Month 3: Answer quality degraded in long conversations
Symptom: After 5+ conversation turns, the AI started mixing up context from different turns.
Root cause: Growing conversation history filled the context window, causing retrieved documents to be compressed or dropped.
Fix: Limit history tokens and auto-summarize older turns:
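MAX_HISTORY_TOKENS = 2000

# estimate_tokens() and summarize_history() are placeholders for your own
# token counter and LLM summarization call.
def manage_conversation_history(history: list, max_tokens: int = MAX_HISTORY_TOKENS) -> list:
    """Return history unchanged if it fits, else a summary plus recent turns."""
    total_tokens = sum(estimate_tokens(msg) for msg in history)
    if total_tokens <= max_tokens:
        return history
    recent = history[-6:]  # Keep last 3 turns (6 messages: user + assistant)
    older = history[:-6]
    if older:
        summary = summarize_history(older)
        return [{'role': 'system', 'content': f'[Earlier summary]\n{summary}'}] + recent
    return recent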
Knowledge Base Scaling Pitfalls
When a single knowledge base exceeds 10,000 chunks, a counterintuitive phenomenon appears: adding more documents actually reduces retrieval precision. The vector space becomes so dense that similarity scores compress into a narrow band (0.6–0.7), which makes it hard to distinguish relevant from irrelevant results.
Solutions:
- Domain sharding: Keep each knowledge base under 5,000 chunks
- Raise similarity threshold from 0.5 to 0.65
- Add metadata filtering: filter by department/doc_type before vector search
- Regular cleanup: remove outdated and duplicate chunks
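The metadata filtering item in the list above is worth a concrete illustration, since it shrinks the candidate set before any similarity score is computed. The sketch below uses plain Python lists as vectors and an in-memory chunk list; the chunk layout, field names, and `filtered_search` are assumptions for this example and not Dify's internal data model.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(query_vec: list, chunks: list, filters: dict,
                    top_k: int = 5, threshold: float = 0.65) -> list:
    """Filter chunks by exact metadata match, then rank by cosine similarity."""
    candidates = [
        chunk for chunk in chunks
        if all(chunk["metadata"].get(key) == value for key, value in filters.items())
    ]
    scored = [(chunk["id"], cosine(query_vec, chunk["vector"])) for chunk in candidates]
    scored = [(chunk_id, score) for chunk_id, score in scored if score >= threshold]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    chunks = [
        {"id": "hr-1", "vector": [1.0, 0.0], "metadata": {"department": "hr"}},
        {"id": "hr-2", "vector": [0.6, 0.8], "metadata": {"department": "hr"}},
        {"id": "it-1", "vector": [1.0, 0.0], "metadata": {"department": "it"}},
    ]
    print(filtered_search([1.0, 0.0], chunks, {"department": "hr"}))
```

Here "it-1" is an equally close vector match but never competes, because the department filter removes it first, and "hr-2" falls below the raised 0.65 threshold.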
Chapter Summary
Project timeline reference:
| Phase | Duration | Key Deliverables |
|---|---|---|
| Requirements research | Weeks 1–2 | Problem inventory, priority ranking |
| Document governance | Weeks 3–4 | Clean document library, version control rules |
| Knowledge base construction | Weeks 5–6 | 4 knowledge bases, documents indexed |
| Application development | Weeks 7–9 | Dify app configuration, integration code |
| Testing and tuning | Weeks 10–11 | Acceptance tests passed, parameters optimized |
| Pilot rollout | Week 12 | 50-user pilot, feedback collected |
| Full launch | Weeks 13–14 | All-staff access, training delivered |
| Continuous iteration | Ongoing | Weekly data analysis, monthly knowledge base refresh |
Critical success factors:
- Document quality first: 70% of RAG quality issues stem from the documents themselves (outdated, poor format, chaotic structure).
- Domain-separated knowledge bases: Never try to solve everything with one monolithic knowledge base.
- Continuous evaluation: Build a Golden Set of test cases and run automated quality checks weekly.
- User feedback loop: Always provide a "Was this answer helpful?" mechanism and act on the data.
- Human fallback: Always preserve an easy path to reach a real person — AI should never be the final wall users hit.