Edges Matter More Than Nodes
TinyMind is a baby proof-of-concept: 1,600 nodes, 2,400 edges, learned through curiosity-driven conversation. It's not intelligent by any reasonable definition of the term, and it's barely useful for anything beyond demonstrating that certain ideas might work at larger scale. But building it taught me something I wasn't expecting, something that changed how I think about knowledge representation and what matters when you're trying to encode understanding in a graph structure.
What I Built
TinyMind starts with almost nothing. Two nodes: Self and Thing. Everything else emerges from conversation and document reading. The mechanism is straightforward: an LLM extracts relationships from input ("X applies to Y", "A is a basis for B", "P has consequence Q"), and each extraction creates or strengthens edges with confidence scores. Run this process enough times and patterns emerge. Not because I designed the patterns, but because topology self-organizes when you give it enough material to work with. The interesting part is what patterns actually emerged, because they weren't what I would have designed if I'd tried to specify them upfront.
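To make the mechanism concrete, here is a minimal sketch of that loop, not TinyMind's actual code: the Graph and Edge classes, the observe method, and the specific reinforcement rule are all illustrative assumptions, with only the two seed nodes and the "create or strengthen with a confidence score" behavior taken from the description above.

```python
from dataclasses import dataclass, field

@dataclass
class Edge:
    relation: str       # e.g. "applies_to", "basis_for"
    confidence: float   # strengthened when the relationship is observed again

@dataclass
class Graph:
    # The graph starts with almost nothing: two seed nodes.
    nodes: set = field(default_factory=lambda: {"Self", "Thing"})
    edges: dict = field(default_factory=dict)   # (src, relation, dst) -> Edge

    def observe(self, src: str, relation: str, dst: str, conf: float = 0.3):
        """Apply one LLM-extracted triple: create the edge or strengthen it."""
        self.nodes.update((src, dst))
        key = (src, relation, dst)
        if key in self.edges:
            edge = self.edges[key]
            # Repeated, consistent observation pushes confidence toward 1.0
            # (one plausible update rule, assumed here for illustration).
            edge.confidence += (1.0 - edge.confidence) * conf
        else:
            self.edges[key] = Edge(relation, conf)

# One extraction from conversation: "X applies to Y"
g = Graph()
g.observe("X", "applies_to", "Y")
```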
The Edge Types That Emerged
Without me specifying them in advance, the system developed these relationship types: basis_for (foundational concepts supporting derived ones), has_consequence (causal and implication relationships), applies_to (scope relationships), contradicts (tension between ideas), supports (reinforcing relationships), and requires (dependency relationships). I didn't tell the system what kinds of relationships to look for. The LLM inferred them from context, and the graph now has semantic richness I wouldn't have thought to encode manually. If I had tried to design a schema upfront, I would have gotten it wrong. The emergent types fit the actual knowledge better than anything I would have specified, which is either evidence for the emergence hypothesis or a lucky accident. I genuinely don't know which.
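Because nothing constrains the relation vocabulary, "what types exist" is just a question you ask the edge set after the fact. A tiny, hypothetical helper (the function name and the edge mapping shape are assumptions carried over from the sketch above; the relation names are the ones that actually emerged):

```python
def relation_vocabulary(edges):
    """List the relation types that have emerged, with no fixed schema.
    `edges` is the (src, relation, dst) -> Edge mapping from the earlier sketch."""
    return sorted({relation for (_, relation, _) in edges})

# e.g. ['applies_to', 'basis_for', 'contradicts', 'has_consequence',
#       'requires', 'supports']
```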
Why Edges Carry More Information
Consider a node in isolation: "basketball." What do you know from the node itself? Properties like round, orange, bouncy, sports equipment. Fine, but not very interesting and not very useful for reasoning. Now consider an edge: basketball connected to soccer_game with a "plays_with" relationship. That tells you something entirely different. Context of use, associated activities, semantic proximity to other concepts. And critically, the relationship type ("plays_with" vs "part_of" vs "competes_with") encodes meaning that node properties don't capture. Multiply this across thousands of edges. The topology becomes knowledge. The structure of relationships carries more semantic content than the nodes themselves. A node is just a label. An edge is a claim about how two things relate, and claims are where the interesting information lives. This is probably obvious to anyone who's worked extensively with knowledge graphs, but it wasn't obvious to me going in, and TinyMind made it concrete in a way that reading papers hadn't.
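Put side by side as data, the asymmetry is easy to see. This is an illustrative sketch, not TinyMind's storage format:

```python
# A node in isolation: a label plus whatever properties you attach to it.
node = {"id": "basketball",
        "properties": ["round", "orange", "bouncy", "sports equipment"]}

# An edge: a typed claim about how two things relate.
edge = {"src": "basketball", "relation": "plays_with",
        "dst": "soccer_game", "confidence": 0.7}

# The relation type is where the semantics live: swap "plays_with" for
# "part_of" or "competes_with" and the meaning of the connection changes,
# even though both endpoint nodes stay exactly the same.
```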
Confidence as Selection Pressure
Every edge in TinyMind has a confidence score. High confidence means the relationship has been observed repeatedly and consistently. Low confidence means it was mentioned once and might be spurious. This creates selection pressure. Low-confidence edges don't get reinforced over time. They fade as the graph evolves. High-traffic connections become hubs. Trivia gets pruned. Meaningful patterns become structural. There's no explicit cleanup algorithm. No scheduled process that reviews edges and deletes the weak ones. Just confidence-weighted topology that self-organizes as new observations arrive. The structure improves without intervention because the mechanism favors reinforcement of useful patterns. Whether this actually produces good knowledge representations at scale is an open question, but at 1,600 nodes, it's at least producing something that looks like structure rather than noise.
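One way that selection pressure can play out numerically, sketched under assumptions: the `reinforce` and `decay` functions and their rates are hypothetical, standing in for whatever reinforcement-without-cleanup mechanism the graph actually uses; the post only claims that unreinforced edges fade while re-observed ones strengthen.

```python
def reinforce(edge_conf: float, observation_conf: float = 0.3) -> float:
    """Push confidence toward 1.0 when an edge is observed again."""
    return edge_conf + (1.0 - edge_conf) * observation_conf

def decay(edge_conf: float, rate: float = 0.01) -> float:
    """Gentle drift downward as the graph evolves; nothing is deleted,
    edges that never get re-observed just stop mattering."""
    return edge_conf * (1.0 - rate)

# An edge mentioned once and never again:
c = 0.3
for _ in range(200):
    c = decay(c)        # ends near 0.04: still present, effectively pruned

# An edge re-observed every few steps keeps getting pulled back up,
# so high-traffic connections stay strong and become hubs.
```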
The Curiosity Loop
TinyMind doesn't just accumulate passively. It generates exploration goals based on graph structure: "Region A connects to Region C through weak edges. What's in between?" "This high-value edge has low confidence. How can we verify it?" "This node has unusual connectivity patterns. What makes it special?" Topology guides exploration. The system notices gaps in its own knowledge and generates queries designed to fill them. It's not general intelligence by any stretch, but it is a feedback loop where structure informs inquiry, and that feedback loop is the part I'm most interested in scaling up.
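A rough sketch of how topology can be turned into questions. The heuristics mirror two of the examples above (low-confidence edges, unusually connected nodes); the `curiosity_goals` function, its thresholds, and the goal phrasing are assumptions for illustration, not the system's actual goal generator.

```python
def curiosity_goals(edges, low_conf=0.4):
    """Generate exploration goals from graph structure (a minimal sketch).
    `edges` is the (src, relation, dst) -> Edge mapping from the earlier sketch."""
    goals = []

    # Low-confidence edges: worth verifying before they fade.
    for (src, relation, dst), edge in edges.items():
        if edge.confidence < low_conf:
            goals.append(
                f"'{src}' {relation} '{dst}' has low confidence "
                f"({edge.confidence:.2f}). Ask something that would confirm or refute it."
            )

    # Nodes with unusual connectivity: here, just the barely-connected ones.
    degree = {}
    for (src, _, dst) in edges:
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    for node, d in degree.items():
        if d == 1:
            goals.append(f"'{node}' is only connected once. What else relates to it?")

    return goals
```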
What I Don't Know
I think this scales. I think at 10,000 or more observations, the emergent structure becomes meaningful enough for real reasoning tasks. But I haven't tested that hypothesis. TinyMind at 1,600 nodes is a toy, and toys don't tell you much about what works at production scale. My guess is that the threshold for useful emergence sits somewhere between 10,000 and 1,000,000 observations, depending on domain complexity and how much structure the reasoning tasks you care about require. I'm nowhere near that range yet. The signal I'm seeing could be genuine, or it could be noise that happens to look like structure at small scale. I won't know until I can run experiments at larger scale, and those experiments require time and compute I haven't allocated yet.
What This Means for Sophia
TinyMind is a proof-of-concept for how Sophia will represent knowledge. The lesson I'm taking from it: don't over-engineer the schema. Specify minimal types. Let edge semantics emerge from experience rather than trying to anticipate them upfront. Intelligence, if Sophia ever achieves anything that deserves that label, won't be in the nodes. It'll be in the structure of relationships. But that's theory. TinyMind suggests the theory might be right. It doesn't prove it. Proof requires scale I don't have yet, and I have to be honest about the gap between "this looks promising at small scale" and "this actually works."
Lessons learned, post 3. Previous: Why You Can't Project V-JEPA Into CLIP