Semantic Verification

How GlyphNet understands meaning, not just text.

Beyond Keyword Matching

Traditional verification systems match exact words. GlyphNet understands semantic equivalence:

# These all verify as the SAME claim:
claims = [
    "The Eiffel Tower is located in Paris",
    "Paris contains the Eiffel Tower",
    "You can find the Eiffel Tower in Paris, France",
    "The Eiffel Tower stands in the French capital"
]
 
for claim in claims:
    result = client.verify(claim)
    print(f"{result['confidence']:.2f}")  # All return ~0.95

Semantic Embeddings

Every concept is represented as a 384-dimensional vector:

"dog"    → [0.12, -0.34, 0.56, ...]
"canine" → [0.11, -0.33, 0.55, ...]  # Very similar
"cat"    → [0.08, -0.21, 0.42, ...]  # Related but distinct
"chair"  → [-0.45, 0.23, -0.12, ...] # Very different

Similarity Measurement

Semantic similarity is measured using cosine similarity between embedding vectors:

similarity(dog, canine) = 0.97  # Nearly identical meaning
similarity(dog, cat)    = 0.72  # Related concepts
similarity(dog, chair)  = 0.15  # Unrelated
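The measurement above can be sketched in plain Python. This is a minimal cosine-similarity implementation over the three vector components shown earlier; real embeddings are 384-dimensional, so the toy numbers here won't match the full-scale scores.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Truncated 3-component vectors from the example above
dog    = [0.12, -0.34, 0.56]
canine = [0.11, -0.33, 0.55]
chair  = [-0.45, 0.23, -0.12]

print(round(cosine_similarity(dog, canine), 3))  # near 1.0: almost identical
print(round(cosine_similarity(dog, chair), 3))   # far lower: very different
```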

Claim Decomposition

Complex statements are broken into atomic claims:

Input:
"Albert Einstein developed the theory of relativity in 1905
while working at the Swiss patent office."

Decomposed Claims:
1. "Albert Einstein developed the theory of relativity"
2. "The theory of relativity was developed in 1905"
3. "Albert Einstein worked at the Swiss patent office"
4. "Einstein developed relativity while at the patent office"

Each claim is verified independently, then results are combined.
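One plausible combination rule, sketched below, is to treat a statement as only as trustworthy as its weakest atomic claim. This is a hypothetical illustration; GlyphNet's actual aggregation rule is not specified here.

```python
def combine_claim_results(results):
    """Combine per-claim verification results into an overall verdict.

    Conservative rule (assumed, not GlyphNet's documented behavior):
    overall confidence is the minimum across atomic claims, and the
    statement verifies only if every atomic claim does.
    """
    return {
        "verified": all(r["verified"] for r in results),
        "confidence": min(r["confidence"] for r in results),
    }

# Illustrative per-claim scores for the Einstein example above
results = [
    {"claim": "Albert Einstein developed the theory of relativity",
     "verified": True, "confidence": 0.97},
    {"claim": "The theory of relativity was developed in 1905",
     "verified": True, "confidence": 0.91},
    {"claim": "Albert Einstein worked at the Swiss patent office",
     "verified": True, "confidence": 0.95},
]
print(combine_claim_results(results))  # {'verified': True, 'confidence': 0.91}
```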

Relation Understanding

GlyphNet recognizes semantic relationships:

Equivalence

"CEO" ≈ "Chief Executive Officer"
"NYC" ≈ "New York City"
"USA" ≈ "United States of America"

Hierarchy

"dog" is_a "mammal" is_a "animal"
"Paris" part_of "France" part_of "Europe"

Properties

"water" has_property "liquid" (at room temperature)
"sun" has_property "hot"
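The three relation types above can be modeled as a small triple store. The sketch below is a hypothetical in-memory representation with transitive `is_a`/`part_of` lookup; GlyphNet's internal graph structure may differ.

```python
# Hypothetical triple store holding the example relations above
TRIPLES = {
    ("dog", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("Paris", "part_of", "France"),
    ("France", "part_of", "Europe"),
    ("water", "has_property", "liquid"),
}

def holds(subject, relation, obj):
    """Check a relation, following is_a / part_of chains transitively."""
    if (subject, relation, obj) in TRIPLES:
        return True
    if relation in ("is_a", "part_of"):
        # Recurse through intermediate nodes, e.g. dog -> mammal -> animal
        return any(
            holds(mid, relation, obj)
            for s, r, mid in TRIPLES
            if s == subject and r == relation
        )
    return False

print(holds("dog", "is_a", "animal"))        # True (via "mammal")
print(holds("Paris", "part_of", "Europe"))   # True (via "France")
```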

Handling Ambiguity

Context Resolution

When terms have multiple meanings, GlyphNet uses context:

# "Apple" in technology context
result = client.verify("Apple released the iPhone in 2007")
# Correctly identifies Apple Inc., not the fruit
 
# "Apple" in food context
result = client.verify("Apples are a healthy fruit rich in fiber")
# Correctly identifies the fruit

Disambiguation Signals

GlyphNet uses multiple signals:

  • Surrounding words and phrases
  • Document-level context
  • Common usage patterns
  • Entity type constraints
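The first signal, surrounding words, can be illustrated with a toy overlap count between the claim and hand-written sense profiles. This is purely illustrative; GlyphNet combines these signals with learned models, not a word-overlap heuristic.

```python
# Hypothetical sense profiles keyed by candidate entity
SENSES = {
    "Apple (company)": {"iphone", "released", "technology", "mac"},
    "apple (fruit)": {"fruit", "healthy", "fiber", "eat"},
}

def disambiguate(claim):
    """Pick the sense whose profile shares the most words with the claim."""
    words = set(claim.lower().replace(".", "").split())
    return max(SENSES, key=lambda sense: len(SENSES[sense] & words))

print(disambiguate("Apple released the iPhone in 2007"))       # Apple (company)
print(disambiguate("Apples are a healthy fruit rich in fiber")) # apple (fruit)
```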

Semantic Similarity Thresholds

| Threshold | Use Case             |
|-----------|----------------------|
| 0.95+     | Exact semantic match |
| 0.85-0.94 | Strong equivalence   |
| 0.70-0.84 | Related concepts     |
| 0.50-0.69 | Weak relation        |
| < 0.50    | Unrelated            |
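The table above maps directly to a small lookup function, sketched here for reference:

```python
def similarity_band(score):
    """Map a similarity score to the interpretation table above."""
    if score >= 0.95:
        return "exact semantic match"
    if score >= 0.85:
        return "strong equivalence"
    if score >= 0.70:
        return "related concepts"
    if score >= 0.50:
        return "weak relation"
    return "unrelated"

print(similarity_band(0.97))  # exact semantic match
print(similarity_band(0.72))  # related concepts
```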

Negation Handling

GlyphNet correctly handles negation:

# Positive claim
result = client.verify("The Earth orbits the Sun")
# verified: true, confidence: 0.98
 
# Negated claim
result = client.verify("The Earth does not orbit the Sun")
# verified: false, confidence: 0.02 (correctly identified as false)
 
# Double negation
result = client.verify("It is not true that the Earth doesn't orbit the Sun")
# verified: true, confidence: 0.95
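The double-negation case reduces to a parity rule: an even number of negation cues preserves polarity, an odd number flips it. The token-counting sketch below is deliberately crude; GlyphNet's handling is semantic rather than a surface count.

```python
# Hand-picked negation cues for illustration only
NEGATION_CUES = {"not", "n't", "never", "no"}

def negation_parity(claim):
    """Return True if polarity is flipped (odd number of negation cues)."""
    # Split contracted negation so "doesn't" contributes one cue
    tokens = claim.lower().replace("doesn't", "does n't").split()
    count = sum(1 for t in tokens if t in NEGATION_CUES)
    return count % 2 == 1

print(negation_parity("The Earth does not orbit the Sun"))   # True: flipped
print(negation_parity(
    "It is not true that the Earth doesn't orbit the Sun"))  # False: two negations cancel
```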

Quantitative Claims

Numbers and quantities are semantically understood:

# Approximate matching
client.verify("Paris has about 2 million people")
# Matches knowledge that Paris population is ~2.1 million
 
# Range validation
client.verify("The boiling point of water is 100°C")
# Verified against known physical constants
 
# Unit conversion
client.verify("The marathon is 26.2 miles")  # or "42.195 km"
# Both verify correctly
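Approximate matching and unit conversion can both be sketched with a relative-tolerance comparison. The 10% default tolerance and the conversion constant are illustrative assumptions, not GlyphNet's documented parameters.

```python
import math

MILES_TO_KM = 1.609344  # exact miles-to-kilometers conversion factor

def quantities_match(claimed, known, rel_tol=0.10):
    """Approximate numeric match within a relative tolerance (assumed 10%)."""
    return math.isclose(claimed, known, rel_tol=rel_tol)

# "about 2 million" vs. a known population of ~2.1 million
print(quantities_match(2_000_000, 2_100_000))  # True

# Unit conversion: 26.2 miles vs. 42.195 km (marathon distance)
print(quantities_match(26.2 * MILES_TO_KM, 42.195, rel_tol=0.001))  # True
```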

Temporal Reasoning

GlyphNet understands time-sensitive claims:

# Historical facts
client.verify("World War II ended in 1945")
# verified: true (historical fact)
 
# Relative time
client.verify("Einstein was born before Hawking")
# verified: true (1879 < 1942)
 
# Current state (use with caution)
client.verify("Joe Biden is the US President")
# May vary based on knowledge cutoff
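The relative-time example above boils down to comparing known dates. A minimal sketch, assuming a small lookup of well-known birth dates:

```python
from datetime import date

# Well-known birth dates used to check the relative-time claim above
BIRTH_DATES = {
    "Albert Einstein": date(1879, 3, 14),
    "Stephen Hawking": date(1942, 1, 8),
}

def born_before(person_a, person_b):
    """Verify 'A was born before B' by comparing known birth dates."""
    return BIRTH_DATES[person_a] < BIRTH_DATES[person_b]

print(born_before("Albert Einstein", "Stephen Hawking"))  # True (1879 < 1942)
```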

Limitations

What Works Well

  • Factual claims about well-known entities
  • Scientific facts and definitions
  • Historical events and dates
  • Geographic relationships

What's Challenging

  • Very recent news (< 30 days)
  • Niche or specialized domains
  • Subjective or opinion-based claims
  • Predictions about the future