How GlyphNet Works

Understanding the technology behind semantic verification.

The Verification Pipeline

Text Input → Claim Extraction → Semantic Encoding → Graph Matching → Confidence Scoring → Results

1. Claim Extraction

GlyphNet analyzes your text to identify discrete factual claims:

Input: "Paris is the capital of France and has a population of about 2 million."

Extracted Claims:
- "Paris is the capital of France"
- "Paris has a population of about 2 million"

Claims are extracted based on:

  • Subject-predicate-object structures
  • Quantitative assertions
  • Temporal statements
  • Causal relationships

2. Semantic Encoding

Each claim is converted into a high-dimensional semantic vector using our Large Concept Model:

"Paris is the capital of France"

[0.234, -0.891, 0.456, ..., 0.123]  (384 dimensions)

This encoding captures:

  • Meaning, not just words
  • Relationships between entities
  • Context and nuance
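
As an illustration of why vectors capture meaning rather than wording, the sketch below compares encodings by the angle between them. The source does not say which similarity measure GlyphNet uses; cosine similarity is a common choice and is assumed here. The 4-dimensional vectors are made-up stand-ins for real 384-dimensional encodings, and `cosine_similarity` is a hypothetical helper, not part of the GlyphNet API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


# Toy values only: stand-ins for the encodings of a claim, a paraphrase
# of it, and an unrelated sentence (assumed, not real model output).
claim = [0.2, -0.9, 0.4, 0.1]
paraphrase = [0.25, -0.85, 0.5, 0.1]
unrelated = [-0.7, 0.1, 0.0, 0.6]

# The paraphrase should score closer to the claim than the unrelated text.
print(cosine_similarity(claim, paraphrase) > cosine_similarity(claim, unrelated))
```

Under this assumption, two sentences with different wording but the same meaning end up with nearly parallel vectors, which is what lets later stages match on meaning.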

3. Knowledge Graph Matching

The encoded claim is matched against our structured knowledge graph:

         ┌─────────┐
         │  Paris  │
         └────┬────┘
              │ capital_of
              ▼
         ┌─────────┐
         │  France │
         └─────────┘

The graph contains:

  • 30,000+ concepts with explicit relations
  • 80,000+ verified edges between concepts
  • 10 canonical relation types (is_a, has_part, requires, etc.)
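
A graph with typed edges can be pictured as a set of (subject, relation, object) triples. The sketch below is a minimal in-memory version under that assumption; the `KnowledgeGraph` class and its methods are hypothetical and say nothing about how GlyphNet stores its graph. The relation names `capital_of` and `is_a` are taken from the text above.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy typed-edge graph: (subject, relation) maps to a set of objects."""

    def __init__(self) -> None:
        self._edges: dict[tuple[str, str], set[str]] = defaultdict(set)

    def add_edge(self, subject: str, relation: str, obj: str) -> None:
        self._edges[(subject, relation)].add(obj)

    def has_edge(self, subject: str, relation: str, obj: str) -> bool:
        # .get avoids creating empty entries for unknown keys
        return obj in self._edges.get((subject, relation), set())


graph = KnowledgeGraph()
graph.add_edge("Paris", "capital_of", "France")
graph.add_edge("Paris", "is_a", "city")

print(graph.has_edge("Paris", "capital_of", "France"))
print(graph.has_edge("Paris", "capital_of", "Germany"))
```

Keying on the relation type means a lookup for `capital_of` never confuses it with another relation between the same two concepts.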

4. Confidence Scoring

Each claim receives a confidence score based on:

Factor               Weight  Description
Semantic similarity  40%     How well the claim matches known facts
Graph path strength  30%     Strength of relational connections
Source consensus     20%     Agreement across multiple paths
Recency              10%     Age of supporting evidence
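
One way to read the weights above is as a weighted sum of four factor scores, each in the range 0 to 1. The source lists only the weights, so the linear combination, the factor key names, and the `confidence_score` function below are assumptions made for illustration.

```python
# Weights from the table above; the key names are hypothetical.
WEIGHTS = {
    "semantic_similarity": 0.40,
    "graph_path_strength": 0.30,
    "source_consensus": 0.20,
    "recency": 0.10,
}


def confidence_score(factors: dict[str, float]) -> float:
    """Combine per-factor scores (each in [0, 1]) into one weighted score."""
    missing = WEIGHTS.keys() - factors.keys()
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


# Made-up factor scores for a single claim.
example = {
    "semantic_similarity": 0.9,
    "graph_path_strength": 0.8,
    "source_consensus": 0.7,
    "recency": 0.5,
}
print(round(confidence_score(example), 2))
```

Because the weights sum to 100%, a claim that scores 1.0 on every factor gets an overall confidence of 1.0 under this reading.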

5. Result Aggregation

Claims are combined into a verification result:

{
  "claims": [
    {
      "text": "Paris is the capital of France",
      "verified": true,
      "confidence": 0.98
    },
    {
      "text": "Paris has a population of about 2 million",
      "verified": true,
      "confidence": 0.85
    }
  ],
  "summary": {
    "total_claims": 2,
    "verified": 2,
    "avg_confidence": 0.915
  }
}
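
The summary fields can be derived from the per-claim entries. The sketch below assumes `avg_confidence` is the arithmetic mean of the claim confidences, which agrees with the sample values above; the `summarize` function and the rounding to three decimals are illustrative choices, not documented behavior.

```python
def summarize(claims: list[dict]) -> dict:
    """Build a summary block from a list of per-claim results."""
    total = len(claims)
    verified = sum(1 for claim in claims if claim["verified"])
    # Assumed: plain arithmetic mean; 0.0 when there are no claims.
    avg = sum(claim["confidence"] for claim in claims) / total if total else 0.0
    return {
        "total_claims": total,
        "verified": verified,
        "avg_confidence": round(avg, 3),
    }


claims = [
    {"text": "Paris is the capital of France", "verified": True, "confidence": 0.98},
    {"text": "Paris has a population of about 2 million", "verified": True, "confidence": 0.85},
]
print(summarize(claims))
```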

Key Concepts

Semantic vs. Syntactic Matching

Traditional fact-checking matches exact text. GlyphNet matches meaning:

GlyphNet treats these as equivalent:
- "Paris is France's capital"
- "The capital of France is Paris"
- "France has Paris as its capital city"

Confidence Thresholds

Confidence    Interpretation
0.90 - 1.00   High confidence - strongly supported
0.70 - 0.89   Medium confidence - likely accurate
0.50 - 0.69   Low confidence - uncertain
0.00 - 0.49   Unverified - no supporting evidence
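
The bands above can be applied in code with a simple lookup. The table does not say where a value such as 0.895 falls, so this sketch assumes each band's lower bound is inclusive; the `interpret` function and its return labels are hypothetical.

```python
def interpret(confidence: float) -> str:
    """Map a confidence score to its band (lower bounds assumed inclusive)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.90:
        return "high"
    if confidence >= 0.70:
        return "medium"
    if confidence >= 0.50:
        return "low"
    return "unverified"


for score in (0.98, 0.85, 0.6, 0.2):
    print(score, interpret(score))
```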

The Flagging Decision

A response is flagged when:

  • Any claim has confidence below threshold (default: 0.7)
  • Claims contradict each other
  • A claim contradicts a known fact that is held with high confidence
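
The three conditions can be read as a logical OR over the claims in a response. The sketch below follows that reading; the `ClaimResult` fields and the `should_flag` function are hypothetical names, and contradiction detection itself is assumed to have happened upstream.

```python
from dataclasses import dataclass


@dataclass
class ClaimResult:
    """Hypothetical per-claim record; field names are illustrative."""
    text: str
    confidence: float
    contradicts_other_claim: bool = False
    contradicts_known_fact: bool = False


def should_flag(claims: list[ClaimResult], threshold: float = 0.7) -> bool:
    """Flag the response if any claim trips any of the three conditions."""
    return any(
        claim.confidence < threshold
        or claim.contradicts_other_claim
        or claim.contradicts_known_fact
        for claim in claims
    )


response = [
    ClaimResult("Paris is the capital of France", 0.98),
    ClaimResult("Paris has a population of about 2 million", 0.85),
]
print(should_flag(response))                  # default threshold of 0.7
print(should_flag(response, threshold=0.9))   # stricter threshold
```

Raising the threshold makes flagging stricter: with both sample claims above 0.7 the response passes at the default, while a threshold of 0.9 flags it because of the 0.85 claim.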

What GlyphNet Cannot Verify

Outside Scope

  • Opinions: "Python is the best language"
  • Predictions: "It will rain tomorrow"
  • Personal experiences: "I enjoyed the movie"
  • Very recent events: News from the past 24 hours

Knowledge Limitations

  • Highly specialized domains may have limited coverage
  • Rapidly changing information may be outdated
  • Fictional content is not verified

Architecture Overview

┌─────────────────────────────────────────────────────────┐
│  API LAYER                                              │
│  REST endpoints, authentication, rate limiting          │
├─────────────────────────────────────────────────────────┤
│  VERIFICATION ENGINE                                    │
│  Claim extraction, semantic matching, scoring           │
├─────────────────────────────────────────────────────────┤
│  LARGE CONCEPT MODEL (LCM)                              │
│  384-dimensional semantic embeddings                    │
├─────────────────────────────────────────────────────────┤
│  KNOWLEDGE GRAPH                                        │
│  30K concepts, 80K relations, typed edges               │
└─────────────────────────────────────────────────────────┘

Performance Characteristics

Metric               Typical Value
Latency              50-200ms per request
Throughput           500+ requests/second
Accuracy             94% on benchmark datasets
False positive rate  < 3%