Knowledge Graph
Schema-driven KG building, vector embeddings, semantic search, impact analysis
Overview
The Knowledge Graph (KG) is a schema-driven graph built automatically from PostgreSQL sources via the 3-schema system. It provides semantic search, impact analysis, and natural language chart generation for manufacturing data.
Unlike traditional ETL approaches, the KG builder reads domain configuration templates and autonomously discovers, maps, and synchronizes data into a Neo4j graph database — with vector embeddings for semantic retrieval.
Architecture
The graph is built in two passes: nodes first, then edges. This ensures all nodes exist before edges are created, avoiding dangling references. PG LISTEN/NOTIFY provides real-time synchronization when source data changes.
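The two-pass idea can be illustrated with a minimal in-memory sketch. It is not the actual builder: the `Graph` class, the `build_graph` function, and the row shapes are assumptions made for illustration, and a real build would write to Neo4j rather than to Python containers.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node_id -> properties
    edges: list = field(default_factory=list)  # (source_id, rel_type, target_id)


def build_graph(node_rows, edge_rows):
    """Two-pass build: create every node first, then every edge."""
    graph = Graph()

    # Pass 1: insert all nodes so that every valid edge endpoint exists.
    for row in node_rows:
        graph.nodes[row["id"]] = {"type": row["type"], **row.get("props", {})}

    # Pass 2: insert edges; an edge with a missing endpoint is reported, not created.
    skipped = []
    for source_id, rel_type, target_id in edge_rows:
        if source_id in graph.nodes and target_id in graph.nodes:
            graph.edges.append((source_id, rel_type, target_id))
        else:
            skipped.append((source_id, rel_type, target_id))
    return graph, skipped


# Hypothetical sample rows; "T-404" is deliberately missing from the node rows.
node_rows = [
    {"id": "M-1", "type": "Machine", "props": {"name": "CNC center 1"}},
    {"id": "A-7", "type": "Article"},
]
edge_rows = [("M-1", "PRODUCES", "A-7"), ("M-1", "USES_TOOL", "T-404")]
graph, skipped = build_graph(node_rows, edge_rows)
```

Here the `PRODUCES` edge is created, while the `USES_TOOL` edge ends up in `skipped` because its target was never inserted in the first pass.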
Node Types
Node types are configurable via domain templates. A typical discrete manufacturing setup includes:
- Machine — CNC centers, assembly stations, test fields
- Article — Finished products, semi-finished goods
- Order — Production orders, customer orders
- Material — Raw materials, purchased parts
- Supplier — Material and component suppliers
- Tool — Cutting tools, fixtures, gauges
- Sensor — OPC-UA tags, MQTT variables
- CNC Program — NC programs linked to operations
Relationships
Edges encode manufacturing semantics and supply chain dependencies:
- PRODUCES: Machine → Article
- WORKS_ON: Machine → Order
- USES_TOOL: Machine → Tool
- HAS_BOM: Article → Material
- SUPPLIED_BY: Material → Supplier
- REQUIRES_PROGRAM: Machine → CNC Program
- HAS_SENSOR: Machine → Sensor
- DEPENDS_ON: Order → Order
MCP Tools (8 KG Tools)
The Knowledge Graph exposes 8 tools via MCP for LLM-driven queries:
- kg_search: Semantic search across all node types using vector embeddings
- kg_get_node: Retrieve a specific node by ID with all properties and edges
- kg_get_neighbors: Get all neighbors of a node, optionally filtered by type or relationship
- kg_shortest_path: Find the shortest path between two nodes in the graph
- kg_impact_analysis: Trace upstream/downstream impact of a node change (e.g., supplier delay)
- kg_cypher_query: Execute arbitrary Cypher queries for advanced analysis
- kg_statistics: Get node/edge counts, type distributions, graph health metrics
- kg_chart: Natural language → Cypher → interactive chart (bar, line, pie, scatter)
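To make the impact-analysis idea concrete, here is a minimal sketch of how a supplier delay could be traced through the graph with a breadth-first walk. It is an illustration only, not the implementation behind kg_impact_analysis: the edge list, the node ID format, and the `impacted_nodes` function are assumptions, and a real query would run against Neo4j.

```python
from collections import deque

# Hypothetical edges, following the relationship directions of the schema.
EDGES = [
    ("Machine:M-1", "PRODUCES", "Article:A-7"),
    ("Article:A-7", "HAS_BOM", "Material:MAT-3"),
    ("Material:MAT-3", "SUPPLIED_BY", "Supplier:S-9"),
    ("Machine:M-2", "PRODUCES", "Article:A-8"),
]


def impacted_nodes(edges, start):
    """Return every node that depends on `start`, mapped to its hop distance.

    Edges point from the dependent node to what it depends on, so the impact
    of a change is found by walking the edges in reverse.
    """
    reverse = {}
    for source, _rel_type, target in edges:
        reverse.setdefault(target, []).append(source)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, []):
            if dependent not in distance:
                distance[dependent] = distance[current] + 1
                queue.append(dependent)

    del distance[start]  # the changed node itself is not part of its impact
    return distance


impact = impacted_nodes(EDGES, "Supplier:S-9")
```

For the sample data, a delay at `Supplier:S-9` reaches the material after one hop, the article after two, and the machine after three, while `Machine:M-2` is unaffected.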
Discovery Tools
Two additional tools support machine and sensor discovery:
- kg_discovered_machines: List all machines discovered from OPC-UA, MQTT, and database sources with their connection status and metadata.
- kg_machine_sensors: List all sensors and variables attached to a specific machine, including data types, units, and current values.
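MCP clients invoke tools with a JSON-RPC 2.0 request whose method is `tools/call`. The sketch below only builds such a request; the argument name `machine_id` is an assumption, since the actual input schema of each tool is not documented here, and transport (stdio or HTTP) is left out.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing request IDs


def tool_call_request(tool_name, arguments):
    """Build a JSON-RPC 2.0 'tools/call' request for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "machine_id" is a hypothetical argument name used for illustration.
request = tool_call_request("kg_machine_sensors", {"machine_id": "M-1"})
payload = json.dumps(request)
```

The same helper would work for any of the KG tools; only the tool name and the arguments change.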
Domain Templates
The KG builder ships with configurable domain templates for different industries:
- Discrete Manufacturing — CNC, assembly, BOM, OEE
- Pharma — Batch records, GMP compliance, equipment qualification
- Chemical — Process units, recipes, SIL levels, material flows
- Medical Devices — UDI tracking, DHR, CAPA, sterilization
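The template file format is not shown in this document, so the following is only a sketch of what a template could carry. It expresses the discrete manufacturing node types and relationships listed earlier as a Python dictionary, plus two hypothetical helpers (`validate_template`, `edge_allowed`) that check the template for consistency.

```python
# Hypothetical in-memory representation of a domain template.
DISCRETE_MANUFACTURING = {
    "node_types": [
        "Machine", "Article", "Order", "Material",
        "Supplier", "Tool", "Sensor", "CNC Program",
    ],
    # relationship name -> (source node type, target node type)
    "relationships": {
        "PRODUCES": ("Machine", "Article"),
        "WORKS_ON": ("Machine", "Order"),
        "USES_TOOL": ("Machine", "Tool"),
        "HAS_BOM": ("Article", "Material"),
        "SUPPLIED_BY": ("Material", "Supplier"),
        "REQUIRES_PROGRAM": ("Machine", "CNC Program"),
        "HAS_SENSOR": ("Machine", "Sensor"),
        "DEPENDS_ON": ("Order", "Order"),
    },
}


def validate_template(template):
    """Return a list of errors for relationships that use unknown node types."""
    known = set(template["node_types"])
    errors = []
    for rel_type, endpoints in template["relationships"].items():
        for node_type in endpoints:
            if node_type not in known:
                errors.append(f"{rel_type}: unknown node type {node_type!r}")
    return errors


def edge_allowed(template, source_type, rel_type, target_type):
    """Check whether an edge matches the direction declared in the template."""
    return template["relationships"].get(rel_type) == (source_type, target_type)


errors = validate_template(DISCRETE_MANUFACTURING)
```

Keeping the allowed endpoints next to each relationship name lets a builder reject edges that point the wrong way, for example `Article` to `Machine` for `PRODUCES`.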
Vector Search
Every node in the graph is enriched with a vector embedding generated by the local LLM. This enables semantic search with natural-language queries.
Vector search is used by the kg_search tool and automatically falls back to keyword search when embeddings are unavailable.
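The ranking and the fallback can be sketched in a few lines. This is a simplified illustration, not the kg_search implementation: the two-dimensional vectors are toy values, the `search` function and node fields are assumptions, and a real setup would obtain the query vector from the embedding model and use a vector index.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(nodes, query_text, query_vector=None, top_k=3):
    """Rank nodes by embedding similarity; fall back to keyword matching."""
    embedded = [node for node in nodes if node.get("embedding")]
    if query_vector is not None and embedded:
        ranked = sorted(
            embedded,
            key=lambda node: cosine(query_vector, node["embedding"]),
            reverse=True,
        )
        return [node["id"] for node in ranked[:top_k]]

    # Fallback: no query vector or no stored embeddings, so match on words.
    terms = query_text.lower().split()
    hits = [n for n in nodes if any(term in n["text"].lower() for term in terms)]
    return [node["id"] for node in hits[:top_k]]


# Toy nodes with made-up embeddings.
NODES = [
    {"id": "M-1", "text": "CNC milling center", "embedding": [0.9, 0.1]},
    {"id": "S-9", "text": "Steel supplier", "embedding": [0.1, 0.9]},
]
semantic = search(NODES, "milling machine", query_vector=[1.0, 0.0], top_k=1)
keyword = search(NODES, "steel")  # no query vector, so the keyword path runs
```

With these toy values, the semantic query returns `["M-1"]` and the keyword fallback returns `["S-9"]`.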
Chart Engine
The kg_chart tool converts natural language questions into Cypher queries, executes them against Neo4j, and returns interactive chart configurations:
- Bar charts — OEE by machine, defects by type
- Line charts — Production trends, sensor data over time
- Pie charts — Material distribution, order status breakdown
- Scatter plots — Correlation analysis (cycle time vs. quality)
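The last step of that pipeline, turning query rows into a chart configuration, might look roughly like the sketch below. The configuration shape (`type`, `title`, `labels`, `values`) and the `to_chart_config` function are assumptions for illustration; the actual format returned by kg_chart is not specified here, and the rows stand in for a Cypher result.

```python
SUPPORTED_CHART_TYPES = {"bar", "line", "pie", "scatter"}


def to_chart_config(rows, chart_type, x_key, y_key, title=""):
    """Convert query result rows into a simple, hypothetical chart configuration."""
    if chart_type not in SUPPORTED_CHART_TYPES:
        raise ValueError(f"unsupported chart type: {chart_type!r}")
    return {
        "type": chart_type,
        "title": title,
        "labels": [row[x_key] for row in rows],
        "values": [row[y_key] for row in rows],
    }


# Hypothetical rows, as a query such as "OEE by machine" might return them.
rows = [{"machine": "M-1", "oee": 0.82}, {"machine": "M-2", "oee": 0.74}]
config = to_chart_config(rows, "bar", "machine", "oee", title="OEE by machine")
```

Separating the query result from the chart configuration keeps the rendering side independent of Cypher and of the graph schema.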
OPC-UA & MTP Integration
The KG integrates with OPC-UA and MTP (VDI 2658) to automatically extract equipment models from AutomationML files and CESMII Smart Manufacturing Profiles. Parsed modules, services, and variables are merged into the graph schema with full ISA-95 hierarchy.
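AutomationML files are CAEX XML documents in which equipment is described as nested `InternalElement` entries under an `InstanceHierarchy`. The sketch below walks a heavily simplified snippet (no XML namespace, no attributes or interfaces) and produces nodes and parent-child edges. The `HAS_CHILD` relationship name and the `extract_hierarchy` function are assumptions, not the integration's actual mapping.

```python
import xml.etree.ElementTree as ET

# Simplified CAEX-style snippet; real AutomationML files carry far more detail.
CAEX_SNIPPET = """
<CAEXFile>
  <InstanceHierarchy Name="Plant">
    <InternalElement Name="Line1" ID="ie-1">
      <InternalElement Name="Mixer" ID="ie-2"/>
      <InternalElement Name="Pump" ID="ie-3"/>
    </InternalElement>
  </InstanceHierarchy>
</CAEXFile>
"""


def extract_hierarchy(xml_text):
    """Return (nodes, edges) for the nested InternalElement hierarchy."""
    root = ET.fromstring(xml_text.strip())
    nodes, edges = [], []

    def walk(element, parent_id):
        # findall with a bare tag name only matches direct children.
        for child in element.findall("InternalElement"):
            child_id = child.get("ID")
            nodes.append({"id": child_id, "name": child.get("Name")})
            if parent_id is not None:
                edges.append((parent_id, "HAS_CHILD", child_id))
            walk(child, child_id)

    for hierarchy in root.findall("InstanceHierarchy"):
        walk(hierarchy, None)
    return nodes, edges


nodes, edges = extract_hierarchy(CAEX_SNIPPET)
```

The resulting nodes and edges could then be fed into the same two-pass build used for database sources, which keeps dangling references out of the imported hierarchy as well.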