Archive notice: The supported FluxEM product is fluxem-tools. This page documents the archived research track. See the repo README for the active tools-first scope.
encode(a) + encode(b) = encode(a + b)

What if embeddings were algebraic?

Deterministic encodings where arithmetic operations become vector operations: no training, no weights—structure by construction.


We've been teaching neural networks arithmetic the hard way

For a decade we've scaled parameters and tuned attention mechanisms. Models can discuss calculus and explain quantum mechanics, yet they still stumble on multiplication.

It's not a capacity problem; it's a representation problem. Tokenize "1234" into arbitrary symbols and you're asking the model to rediscover, through gradient descent, that those symbols have algebraic structure, again and again.

FluxEM takes the opposite tack: encode numbers so that vector addition is arithmetic addition. The structure is given up front, not learned.
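
A minimal sketch of that property, assuming the simplest possible encoding: scale a fixed direction vector by the number. The names `DIRECTION`, `encode`, `add`, and `decode` are hypothetical and chosen for illustration; FluxEM's actual encoders may be built differently. Any linear map has this property, which is the whole point.

```python
from fractions import Fraction

# Hypothetical fixed direction; any vector with a nonzero first entry works here.
DIRECTION = (Fraction(1), Fraction(2), Fraction(-1), Fraction(3))

def encode(x):
    """Map a number to a vector by scaling a fixed direction (a linear map)."""
    x = Fraction(x)
    return tuple(x * d for d in DIRECTION)

def add(u, v):
    """Plain component-wise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def decode(v):
    """Recover the number from the first coordinate."""
    return v[0] / DIRECTION[0]

a, b = 12345, 67890
# Adding the vectors gives exactly the vector of the sum.
assert add(encode(a), encode(b)) == encode(a + b)
total = decode(add(encode(a), encode(b)))  # Fraction(80235, 1)
```

Exact rationals are used instead of floats so the equality holds without rounding error.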

And it doesn't stop at arithmetic: physical units, molecular formulas, musical pitch classes, logical propositions—any domain with algebraic structure can be embedded so its operations become geometric.
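
Physical units give a concrete case. One common way to do this, sketched here as an assumption rather than as FluxEM's layout, is to store a unit as a vector of exponents over base dimensions. Multiplying units then adds vectors, dividing subtracts them, and raising to a power scales them. The dimension order and helper names below are hypothetical.

```python
# Base dimensions in a fixed, hypothetical order: (length, mass, time).
METRE = (1, 0, 0)
KILOGRAM = (0, 1, 0)
SECOND = (0, 0, 1)

def mul(u, v):
    """Multiplying units adds their exponent vectors."""
    return tuple(a + b for a, b in zip(u, v))

def div(u, v):
    """Dividing units subtracts exponent vectors."""
    return tuple(a - b for a, b in zip(u, v))

def power(u, n):
    """Raising a unit to a power scales its exponent vector."""
    return tuple(a * n for a in u)

acceleration = div(METRE, power(SECOND, 2))  # m/s^2    -> (1, 0, -2)
force = mul(KILOGRAM, acceleration)          # kg*m/s^2 -> (1, 1, -2)

# Dimensional analysis becomes vector equality.
assert force == mul(KILOGRAM, div(METRE, mul(SECOND, SECOND)))
```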

Tool calls scale capability without changing the model

FluxEM keeps the base LLM intact and teaches it to delegate. Each query retrieves a small, relevant tool set through embedding search over a FAISS index, narrowed by a domain router. The model then chooses among those tools and executes deterministic calls.

This is external embedding scaling: capacity grows in the tool index and retrieval layer, not in the model's internal embedding tables.
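
The retrieve-then-call loop can be sketched in plain Python. This is an illustration under assumptions, not FluxEM's implementation: the `TOOLS` registry, the bag-of-words `embed` function, and the brute-force cosine ranking are hypothetical stand-ins for a learned embedding and a FAISS index, and the domain is passed in by hand where a router would choose it.

```python
import math
from collections import Counter

# Hypothetical tool registry: name -> (domain, description, callable).
TOOLS = {
    "add": ("arithmetic", "add two numbers sum plus", lambda a, b: a + b),
    "multiply": ("arithmetic", "multiply two numbers product times",
                 lambda a, b: a * b),
    "gc_content": ("biology", "gc content fraction of a dna sequence",
                   lambda s: (s.count("G") + s.count("C")) / len(s)),
}

def embed(text):
    """Toy bag-of-words embedding; stands in for a learned text embedding."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def retrieve(query, domain, k=2):
    """Keep tools in the routed domain, then rank them by similarity."""
    q = embed(query)
    scored = [(cosine(q, embed(desc)), name)
              for name, (dom, desc, _) in TOOLS.items() if dom == domain]
    return [name for _, name in sorted(scored, reverse=True)[:k]]

# The model would pick from the shortlist; here we take the top hit.
shortlist = retrieve("multiply two numbers", domain="arithmetic", k=1)
result = TOOLS[shortlist[0]][2](144, 89)  # deterministic call -> 12816
```

Adding a tool means adding a registry entry; nothing about the model changes.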

Eleven domains

Existence proofs for a general principle

Physics

9.8 m/s²

Dimensional quantities with unit conversion and dimensional analysis

Chemistry

C₆H₁₂O₆

Molecular formulas with stoichiometric operations and mass balance

Biology

ATGCCGTAG

DNA sequences with GC content, melting temperature, translation

Mathematics

3 + 4i

Complex numbers, matrices, vectors, polynomials, rationals

Logic

p ∧ q → r

Propositional and predicate logic with satisfiability detection

Music

{0, 4, 7}

Pitch classes, chords, scales, atonal set theory with TnI operations

Geometry

△ABC

Shapes with area, centroid, circumcenter, containment tests

Graphs

G = (V, E)

Adjacency structures with connectivity, cycles, shortest paths

Sets

A ∪ B

Finite sets and relations with union, intersection, composition

Number Theory

360 = 2³·3²·5

Prime factorization, modular arithmetic, Euler totient

Data

[x₁, x₂, ...]

Arrays, records, and tables as structured embeddings
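
To make one of these cards concrete, here is a sketch for number theory, assuming a representation in which an integer is stored as its map of prime exponents. Under that assumption, multiplying integers is the same as adding exponent vectors. The helpers `factorize` and `product` are hypothetical and written for this page, not taken from the FluxEM API.

```python
from collections import Counter

def factorize(n):
    """Prime factorization by trial division, as a prime -> exponent map."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1  # whatever remains is itself prime
    return factors

def product(factors):
    """Rebuild the integer from its prime exponents."""
    result = 1
    for p, e in factors.items():
        result *= p ** e
    return result

a = factorize(360)  # 2^3 * 3^2 * 5
b = factorize(84)   # 2^2 * 3 * 7
combined = a + b    # Counter addition sums exponents per prime
assert product(combined) == 360 * 84
```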

Five minutes to working code

# Install
pip install fluxem            # or: pip install "fluxem[jax]"

# Use
from fluxem import create_unified_model

model = create_unified_model()
model.compute("12345 + 67890")   # → 80235.0
model.compute("144 * 89")        # → 12816.0

# Or work with specific domains
from fluxem.domains.music import AtonalSetEncoder, prime_form
from fluxem.domains.chemistry import MoleculeEncoder, Formula

enc = AtonalSetEncoder()
major = enc.encode([0, 4, 7])     # C major triad as pitch-class set
prime_form([0, 4, 7])            # → (0, 3, 7)

mol = MoleculeEncoder()
glucose = mol.encode(Formula.parse('C6H12O6'))

"When you encode domains with their algebra intact, discoveries in one domain become visible in another."

— from "Where this idea comes from"
