An NLP-Driven Framework for Curriculum-Labor Market Alignment: Schema-Constrained LLM Extraction, ESCO-Anchored Semantic Matching, and Multi-Dimensional Gap Quantification
Title: Bridging the Curriculum-Labor Market Divide: A Novel NLP Framework Utilizing Schema-Constrained LLM Extraction, ESCO-Based Semantic Matching, and Comprehensive Gap Analysis
Abstract:
Extracting structured information from disparate educational and labor-market texts via Natural Language Processing (NLP) remains a significant hurdle. Current methodologies predominantly depend on lexical-surface techniques, which frequently fail to identify implicit competencies, lack integration with shared taxonomies, and offer no formal metrics for extraction reliability or document completeness. To overcome these deficiencies, this study introduces a comprehensive four-stage NLP framework. This approach integrates (i) schema-constrained prompting of a dual-model frontier Large Language Model (LLM) ensemble against a JSON Schema-enforced seven-slot competency formalism; (ii) the alignment of extracted records with the eleven-domain ESCO v1.2.1 controlled vocabulary using Sentence-BERT (SBERT); (iii) a two-tier adjudication protocol designed to resolve discrepancies between models; and (iv) a robust verification mechanism employing per-slot Cohen’s kappa, schema conformance checks, and document-level completeness audits.
We demonstrate the framework’s utility through a higher-education quality assurance application: aligning the curriculum of the ABET-accredited BSc Computer Science program at the United Arab Emirates University with labor market demand. The pipeline processed the 2025-2026 study plan and extracted 400 competency records from 85 courses. These records were aligned with 30 job postings containing 483 requirement clauses, using an SBERT cosine similarity threshold of 0.50. The alignment was analyzed at five scopes, ranging from the computing core to a probability-weighted student trajectory.
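The thresholded matching step can be illustrated with a minimal sketch. It assumes each requirement clause is paired with its single most similar competency record and counted as matched when the cosine similarity is at least 0.50. The best-match rule, the inclusive comparison, the function names, and the toy three-dimensional vectors are all illustrative assumptions. In practice the vectors would come from an SBERT encoder, which is not reproduced here.

```python
import math

# Threshold reported in the abstract; treating it as inclusive is an assumption.
THRESHOLD = 0.50


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_clauses(clause_vecs, competency_vecs, threshold=THRESHOLD):
    """Map each clause id to (best competency id or None, best similarity).

    A clause is left unmatched (None) when even its most similar
    competency record falls below the threshold.
    """
    matches = {}
    for clause_id, clause_vec in clause_vecs.items():
        best_id, best_sim = None, -1.0
        for comp_id, comp_vec in competency_vecs.items():
            sim = cosine(clause_vec, comp_vec)
            if sim > best_sim:
                best_id, best_sim = comp_id, sim
        matched_id = best_id if best_sim >= threshold else None
        matches[clause_id] = (matched_id, best_sim)
    return matches


# Toy stand-ins for SBERT embeddings (hypothetical ids and values).
competency_vecs = {"c1": [1.0, 0.0, 0.0], "c2": [0.0, 1.0, 0.0]}
clause_vecs = {"j1": [0.9, 0.1, 0.0], "j2": [0.0, 0.0, 1.0]}

matches = match_clauses(clause_vecs, competency_vecs)
unmatched = [cid for cid, (comp, _) in matches.items() if comp is None]
```

In this toy input, clause `j1` is matched to `c1`, while `j2` is orthogonal to every competency vector and stays unmatched. Unmatched clauses of this kind are the raw material for a supply-demand gap measure, although the exact gap formula is not given in this abstract.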
The extraction component demonstrated high reliability, achieving a Cohen’s kappa of 0.79 for the skill slot, alongside 100% schema conformance and document-level completeness. The alignment results revealed interpretable supply-demand gaps: 25.0% in general and transversal skills, 13.8% in algorithms and computational theory, and 12.2% in software engineering and project management. Notably, the gap in artificial intelligence and data science was negligible at 1.8%, despite a supply coverage of 38.6%.
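As a point of reference for the reliability figure, the following is a minimal sketch of Cohen's kappa for two annotators, here standing in for the two LLMs. It assumes each model's output for a given slot has been reduced to one categorical label per record. That pairing step, the function name, and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of records where both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical skill-slot labels from two models for four records.
model_a = ["python", "python", "sql", "sql"]
model_b = ["python", "python", "sql", "python"]

kappa = cohens_kappa(model_a, model_b)
```

For these four records the observed agreement is 0.75 and the chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.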
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC




