Lewis & Clark Research Database

Findings

Researcher's-eye overviews of patterns that emerge from reading the full corpus in aggregate. Drafted by site engineers; not authoritative scholarly positions. Each claim links to the underlying queryable data.

Cross-Narrator Parallels at Fort Clatsop

When a near-duplicate audit surfaces a documented historical pattern, not a bug

May 12, 2026 · Ryan Abrahamsen — Lewis and Clark Trust

A near-duplicate audit (MinHash LSH + SequenceMatcher) found 50 confirmed clusters of paraphrased journal text, all of them Clark/Lewis pairs written on the same date during the Fort Clatsop winter. This matches a documented historical pattern: at Fort Clatsop the captains often kept near-identical journals. The 50 pairs were flagged with parallels_entry meta and now appear as a cross-reference card on each single-entry page. Here, computational text analysis confirms and makes visible something scholars have known for a century.
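A two-stage audit of this kind can be sketched roughly as follows. This is a minimal illustration, not the site's actual pipeline: it swaps in exact Jaccard similarity over word shingles for the MinHash LSH candidate stage (MinHash LSH approximates the same measure at scale), and the function names, the `(entry_id, text)` input shape, and both thresholds are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def word_shingles(text, k=3):
    """Return the set of k-word shingles in a lowercased text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_parallels(entries, candidate_min=0.3, confirm_min=0.8):
    """Return (id_a, id_b, ratio) for entry pairs that pass both stages.

    entries: list of (entry_id, text) tuples (hypothetical shape).
    Thresholds are illustrative, not the values used in the audit.
    """
    shingled = {entry_id: word_shingles(text) for entry_id, text in entries}
    confirmed = []
    for (id_a, text_a), (id_b, text_b) in combinations(entries, 2):
        # Stage 1: cheap candidate filter. Exact Jaccard is used here;
        # MinHash LSH estimates it without comparing every pair.
        if jaccard(shingled[id_a], shingled[id_b]) < candidate_min:
            continue
        # Stage 2: character-level confirmation with SequenceMatcher.
        ratio = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if ratio >= confirm_min:
            confirmed.append((id_a, id_b, ratio))
    return confirmed
```

The all-pairs loop is fine for a sketch but grows quadratically with corpus size; avoiding that comparison cost is the reason to use an LSH index for the candidate stage.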

An Editorial Audit: Finding and Replacing 207 Duplicate Entries

A routine MD5 audit, 35 duplicate clusters, and what it suggests about AI-augmented archives

May 12, 2026 · Ryan Abrahamsen — Lewis and Clark Trust

A routine MD5 audit of the journal corpus found 207 entries (6%) carrying duplicated content, concentrated in the pre-departure 1803-1804 window where primary sources are sparse. An earlier editorial generation pass had created template-based daily entries from a small number of representative narratives. This write-up documents what the audit found, how the entries were fixed (duplicated text replaced with honest editorial notes, dates preserved as timeline placeholders), and what the episode suggests about how to read the rest of the archive and how to operate AI-augmented archives generally.
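An exact-duplicate audit along these lines might look like the sketch below. It is an assumption-laden illustration rather than the audit that was actually run: the `(entry_id, text)` input shape, the function names, and the whitespace and case normalization step are all hypothetical.

```python
import hashlib
from collections import defaultdict


def normalize(text):
    """Collapse whitespace and lowercase so trivial differences hash alike."""
    return " ".join(text.split()).lower()


def md5_duplicate_clusters(entries):
    """Group entry ids whose normalized text shares an MD5 digest.

    entries: iterable of (entry_id, text) tuples (hypothetical shape).
    Returns a list of clusters, each a list of two or more entry ids.
    """
    by_digest = defaultdict(list)
    for entry_id, text in entries:
        digest = hashlib.md5(normalize(text).encode("utf-8")).hexdigest()
        by_digest[digest].append(entry_id)
    # Keep only digests shared by more than one entry.
    return [ids for ids in by_digest.values() if len(ids) > 1]
```

MD5 serves here only as a fast content fingerprint, not as a security measure. It catches exact repeats after normalization but will miss paraphrased text, which needs a near-duplicate method instead.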

The Corps of Discovery’s Larder: Food and Trade Across the Route

Six observations about the expedition's diet drawn from 4,000+ food and animal mentions

May 12, 2026 · Ryan Abrahamsen — Lewis and Clark Trust

A quantitative look at what the Corps of Discovery actually ate and traded across the 28-month journey: meat dominance (1,052 deer + 757 elk + 369 beaver + ~300 buffalo entries), the geographic shift from Missouri River deer abundance to Pacific Coast salmon dependency, the Bitterroot famine as documented through entry-length compression, the under-recorded centrality of roots to the Corps's survival, and the shift from diplomatic gift-giving on the outbound leg to necessity-driven trade on the return leg.

What the Journals Show When Read in Aggregate

Six patterns that emerge when 3,415 journal entries are read together

May 12, 2026 · Ryan Abrahamsen — Lewis and Clark Trust

Reading the full 3,415-entry corpus in aggregate brings out six patterns: the journals are more interlocking than independent; tribal mention density tracks dependence, not respect; Sacagawea is named on only 37 of her ~580 days of presence; writing length is a stress signal; the expedition wrote roughly 5.5 million characters across 28 months; and the wildlife record is two records layered together. Each claim links to queryable data on this site.
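To put the headline figures in proportion, the short calculation below derives a few rates from the numbers quoted in this summary. The variable names are illustrative, and because the inputs are rounded or approximate (the day count is given as "~580"), the results should be read as rough estimates.

```python
# Figures as quoted in the summary; several are rounded or approximate.
corpus_entries = 3415
total_chars = 5_500_000
months = 28
days_named = 37
days_present = 580  # stated as "~580"

# Share of Sacagawea's days of presence on which she is named.
named_share = days_named / days_present
# Average entry length and average monthly output in characters.
chars_per_entry = total_chars / corpus_entries
chars_per_month = total_chars / months

print(f"Named on {named_share:.1%} of days present")
print(f"{chars_per_entry:,.0f} characters per entry on average")
print(f"{chars_per_month:,.0f} characters per month on average")
```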


Findings are working drafts that interpret the database's aggregate patterns. They are written by the site engineers and are not authoritative scholarly publications; corrections are welcome. Contact ryan@terrain360.com with feedback or peer review.
