We shipped the Orb — our persistent memory system — to 38 beta users four weeks ago. Here is what we learned.
Density is not linear
We expected memory density to grow roughly linearly with usage. It does not. Three of our users produced 60% of the inner-tier facts in the corpus; the median user produced about 240 crystallized facts in four weeks. Heavy users interact in fact-dense ways — scheduling, mail triage, contact management — which the consolidation pass picks up on. Casual users produce much lower density.
This is fine. The architecture scales with density, not wall-clock time, and the heavy users report the largest "this knows me" effect.
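To make the skew concrete, here is a minimal sketch of how the top-user share and the median could be computed from per-user fact counts. The function name `top_k_share` and the counts below are made up for illustration; they are not the beta data.

```python
from statistics import median

def top_k_share(fact_counts, k=3):
    """Fraction of all facts produced by the k most prolific users."""
    total = sum(fact_counts)
    if total == 0:
        return 0.0
    top = sorted(fact_counts, reverse=True)[:k]
    return sum(top) / total

# Hypothetical per-user fact counts, chosen only to illustrate a skewed distribution.
counts = [900, 700, 500, 240, 240, 200, 120, 100]

share = top_k_share(counts, k=3)   # share held by the three heaviest users
typical = median(counts)           # what a "typical" user produced
print(f"top-3 share: {share:.0%}, median: {typical}")
```

Reporting both numbers side by side is useful because a mean would be pulled up by the heavy users and would hide how little the casual users produce.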
Recall failures are specific
The failure mode we expected — the Orb fails to surface a relevant fact — is rare. The failure we actually see is retrieval of the wrong fact when two facts are similar. ("What time is my meeting with Sam?" pulls the fact about last week's Sam meeting.)
We're mitigating this with a re-ranking head conditioned on temporal context. In our internal eval, Recall@1 climbs from 88.9% to 93.1%. The change rolls out this week.
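As a rough illustration of the idea, the sketch below re-ranks retrieved facts by blending the retriever's similarity score with a temporal score, then measures Recall@1. This is a hand-written heuristic standing in for the learned re-ranking head, not the head itself. The `Fact` fields, the decay horizon, the blend weight, and the assumption that the query is about an upcoming event are all hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Fact:
    text: str
    similarity: float      # first-stage retriever score, assumed to lie in [0, 1]
    event_time: datetime   # when the event described by the fact takes place

def temporal_score(fact, now, horizon_days=7.0):
    """Score in [0, 1]: events in the past get 0, future events decay with distance."""
    delta_days = (fact.event_time - now).total_seconds() / 86400.0
    if delta_days < 0:
        return 0.0
    return math.exp(-delta_days / horizon_days)

def rerank(facts, now, weight=0.5):
    """Sort facts by a weighted blend of similarity and temporal score (assumed weights)."""
    def combined(fact):
        return (1 - weight) * fact.similarity + weight * temporal_score(fact, now)
    return sorted(facts, key=combined, reverse=True)

def recall_at_1(ranked_lists, gold_texts):
    """Fraction of queries whose top-ranked fact is the expected one."""
    hits = sum(
        1 for ranked, gold in zip(ranked_lists, gold_texts)
        if ranked and ranked[0].text == gold
    )
    return hits / len(gold_texts)

# Hypothetical query: "What time is my meeting with Sam?" asked on a Monday morning.
now = datetime(2024, 5, 13, 9, 0)
candidates = [
    Fact("Sam meeting, last week, 2pm", 0.92, datetime(2024, 5, 6, 14, 0)),
    Fact("Sam meeting, this week, 10am", 0.90, datetime(2024, 5, 15, 10, 0)),
]

by_similarity = sorted(candidates, key=lambda f: f.similarity, reverse=True)
by_rerank = rerank(candidates, now)
```

On similarity alone the stale fact ranks first, since the two facts are nearly identical in wording. The temporal term breaks the tie in favor of the upcoming meeting. A learned head would infer from the query whether the user means a past or a future event instead of hard-coding that preference.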
Users audit their memory
We did not expect this. Users regularly open the Orb inspector and read their own memory. Some curate it — edit facts, delete things, correct misattributions. This is a behavior we did not design for, but it turns out to be one of the most important trust-building loops in the product.
We are leaning into it. A forthcoming release adds explicit "forget this" and "correct this" affordances, available directly from any response that cites memory.
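One way such affordances could map onto a memory store is sketched below. Everything here is an assumption for illustration: the `MemoryStore` class, its method names, and the idea of keeping prior versions for audit are hypothetical and do not describe the Orb's actual interface.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryFact:
    fact_id: str
    text: str
    history: list = field(default_factory=list)  # earlier versions, kept for audit

class MemoryStore:
    """Toy in-memory store; a response would cite facts by fact_id."""

    def __init__(self):
        self._facts = {}

    def add(self, fact_id, text):
        self._facts[fact_id] = MemoryFact(fact_id, text)

    def get(self, fact_id):
        return self._facts.get(fact_id)

    def forget(self, fact_id):
        """'Forget this': delete the fact. Returns True if it existed."""
        return self._facts.pop(fact_id, None) is not None

    def correct(self, fact_id, new_text):
        """'Correct this': replace the text and record the old version."""
        fact = self._facts[fact_id]  # raises KeyError for an unknown id
        fact.history.append(fact.text)
        fact.text = new_text
        return fact

# Hypothetical usage: a response cited facts "f1" and "f2".
store = MemoryStore()
store.add("f1", "Sam prefers morning meetings")
store.add("f2", "Dentist appointment is on Friday")

store.correct("f1", "Sam prefers afternoon meetings")
forgotten = store.forget("f2")
```

Because each cited fact carries an identifier, both actions can be offered inline next to the response that used the fact, without sending the user to the inspector first.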