Technology – NeuroCollective Research

How It Works

The Technology

A language-agnostic knowledge graph requires rethinking every layer of the knowledge representation stack — from how individual words are grouped, to how concepts are encoded, to how queries resolve across linguistic boundaries.

Architecture

Three-Layer Semantic Stack

The NeuroCollective architecture separates linguistic surface forms from conceptual meaning through three distinct layers, each with a specific function.

Layer 1

Lexeme Groups

Surface-form variants of a single concept within one language. For example, walk, walking, walked, and walker are different morphological forms with the same underlying meaning. Forms are grouped per language and normalized by a language-specific lemmatizer.
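As a rough illustration of the grouping step, the sketch below collects surface forms under a (language, lemma) key. The lookup table TOY_LEMMAS and the functions lemmatize and build_lexeme_groups are hypothetical stand-ins invented for this example; a real system would call an actual language-specific lemmatizer, and nothing here is taken from the NeuroCollective implementation.

```python
from collections import defaultdict

# Hypothetical lookup tables standing in for real language-specific lemmatizers.
TOY_LEMMAS = {
    "en": {"walk": "walk", "walking": "walk", "walked": "walk", "walker": "walk"},
    "fr": {"marcher": "marcher", "marche": "marcher"},
}


def lemmatize(language: str, surface_form: str) -> str:
    """Return the lemma for a surface form, falling back to the form itself."""
    form = surface_form.lower()
    return TOY_LEMMAS.get(language, {}).get(form, form)


def build_lexeme_groups(tokens):
    """Group (language, surface_form) pairs under a (language, lemma) key."""
    groups = defaultdict(set)
    for language, surface_form in tokens:
        key = (language, lemmatize(language, surface_form))
        groups[key].add(surface_form.lower())
    return dict(groups)


groups = build_lexeme_groups([
    ("en", "walk"), ("en", "Walking"), ("en", "walked"), ("en", "walker"),
    ("fr", "marcher"), ("fr", "marche"),
])
```

Keying each group by language as well as lemma keeps identical strings in different languages from being merged by accident.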

Layer 2

Interlinguals

Cross-language concept bridges. An Interlingual node links the English lexeme group for WALK to the German group (Lauf, laufen), the French group (marcher, marche), and so on. These links are informed by multilingual lexical resources and typological databases (Grambank, WALS).
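One way to picture an Interlingual is as a small record that maps language codes to lexeme groups. The sketch below is a minimal illustration under stated assumptions: the class names, the key field, and the rule of one lexeme group per language are all hypothetical choices made for this example, not a description of the production data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LexemeGroup:
    language: str      # language code, e.g. "en"
    lemma: str
    forms: frozenset   # surface forms belonging to this group


@dataclass
class Interlingual:
    """Bridges the lexeme groups that express one concept across languages."""
    key: str                                     # hypothetical internal handle
    groups: dict = field(default_factory=dict)   # language code -> LexemeGroup

    def link(self, group: LexemeGroup) -> None:
        # Assumption: at most one lexeme group per language per Interlingual.
        self.groups[group.language] = group

    def forms_in(self, language: str) -> frozenset:
        group = self.groups.get(language)
        return group.forms if group else frozenset()


walk = Interlingual(key="WALK")
walk.link(LexemeGroup("en", "walk", frozenset({"walk", "walking", "walked", "walker"})))
walk.link(LexemeGroup("de", "laufen", frozenset({"Lauf", "laufen"})))
walk.link(LexemeGroup("fr", "marcher", frozenset({"marcher", "marche"})))
```

Asking for a language that has not been linked yet, for example walk.forms_in("th"), returns an empty set rather than raising an error.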

Layer 3

SupraConcepts

The language-neutral semantic core. A SupraConcept node is the canonical representation of a meaning — no language attached, no string label, just a unique identifier and a set of semantic relationships to other SupraConcepts. Every lexeme in every language resolves to a SupraConcept.
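The idea of a node with no label, only an identifier and relations, can be sketched as follows. Everything named here is an assumption made for illustration: SupraConcept as a dataclass, UUIDs as identifiers, the ConceptIndex helper, and the relation name "is_a". The sketch also collapses the Lexeme Group and Interlingual layers into a direct lookup to stay short.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class SupraConcept:
    """Language-neutral node: an opaque identifier plus typed relations."""
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    relations: set = field(default_factory=set)   # {(relation_name, target_id)}

    def relate(self, relation: str, target: "SupraConcept") -> None:
        self.relations.add((relation, target.id))


class ConceptIndex:
    """Maps (language, surface form) pairs to SupraConcepts."""

    def __init__(self):
        self._by_lexeme = {}   # (language, form) -> concept id
        self._concepts = {}    # concept id -> SupraConcept

    def register(self, concept, lexemes):
        self._concepts[concept.id] = concept
        for language, form in lexemes:
            self._by_lexeme[(language, form.lower())] = concept.id

    def resolve(self, language, form):
        concept_id = self._by_lexeme.get((language, form.lower()))
        return self._concepts.get(concept_id)   # None if the lexeme is unknown


walking = SupraConcept()
moving = SupraConcept()
walking.relate("is_a", moving)   # hypothetical relation name

index = ConceptIndex()
index.register(walking, [("en", "walked"), ("fr", "marcher"), ("de", "laufen")])
```

Here index.resolve("fr", "marcher") and index.resolve("de", "laufen") return the same object, and that object carries no string label in any language.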

Interactive Demo

SupraConcept Resolution

Seven lexemes across four languages. One SupraConcept. The graph below is a static demonstration — the actual knowledge graph is not publicly accessible.

Why It Matters

The Problem with Language-Dependent KGs

English Centrism

Most knowledge graphs — including Wikidata, ConceptNet, and enterprise data dictionaries — use English as the organizing principle. Non-English data is either translated or excluded.

Translation Bottleneck

Bridging knowledge across languages via translation introduces errors, loses nuance, and fails entirely on concepts that don’t have clean translations. Some concepts exist only in specific languages.

Maintenance at Scale

A language-dependent KG serving N languages requires N separate maintenance tracks, and every added language adds another track that must be kept in sync with the rest. In NeuroCollective’s architecture, a new language contributes Lexeme Groups that are attached to existing Interlinguals without touching the semantic core, so maintenance effort scales linearly with language coverage.

Query Failure

When a user queries in Thai, they shouldn’t miss results that were entered in Finnish. Language-agnostic graph traversal means the query language and the data language are irrelevant to retrieval quality. Human knowledge does not change if it is presented in a different language than originally recorded.
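The retrieval claim can be illustrated with a toy graph in which facts are stored against concept identifiers rather than words. This is a sketch under assumptions: the AgnosticGraph class, the identifiers "sc-0001" and "sc-0002", and the "is_a" fact are invented placeholders, and the two verbs (Finnish kävellä, Thai เดิน, both meaning "to walk") are chosen only as an example.

```python
from collections import defaultdict


class AgnosticGraph:
    """Toy store where facts attach to concept ids, never to words."""

    def __init__(self):
        self.lexeme_to_concept = {}      # (language, form) -> concept id
        self.facts = defaultdict(list)   # concept id -> recorded facts

    def add_lexeme(self, language, form, concept_id):
        self.lexeme_to_concept[(language, form.lower())] = concept_id

    def record(self, language, form, fact):
        # The entry language is used only to find the concept, then discarded.
        concept_id = self.lexeme_to_concept[(language, form.lower())]
        self.facts[concept_id].append(fact)

    def query(self, language, form):
        concept_id = self.lexeme_to_concept.get((language, form.lower()))
        return list(self.facts[concept_id]) if concept_id in self.facts else []


graph = AgnosticGraph()
graph.add_lexeme("fi", "kävellä", "sc-0001")   # placeholder concept id
graph.add_lexeme("th", "เดิน", "sc-0001")

# A fact entered through the Finnish lexeme...
graph.record("fi", "kävellä", ("is_a", "sc-0002"))

# ...is found by a query made through the Thai lexeme.
thai_results = graph.query("th", "เดิน")
```

Because both lexemes resolve to the same identifier before any lookup happens, the result list is identical whichever language the query arrives in.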

Read the Research

Explore the active research tracks behind the technology.
