PING77
Every founder, researcher, and trader has a pile of projects rotting on their hard drive. These "failures" are the most underrated dataset in the world—they tell you precisely which paths don't work, in which month they break, and why.
PING77 digs up these corpses, uses LLMs to extract death modes, packages them into tokens, and feeds them into a startup risk prediction market. For the first time, those who failed earn yield by contributing the lessons.
No official deployment. No hosted version. No airdrop.
Just an open protocol spec and a minimum viable implementation.
The whole loop, in one diagram:
```
+----------------+      +----------------+      +----------------+
|   DEAD REPO    |      |      LLM       |      |   DEATH MODE   |
|  pitch deck    | ---> |   extractor    | ---> |    tokens      |
|  post-mortem   |      |                |      |                |
+----------------+      +----------------+      +-------+--------+
                                                        |
                                                        v
+----------------+      +----------------+      +----------------+
|  CONTRIBUTOR   |      |     MARKET     |      |  NEW FOUNDER   |
|  earns fees    | <----|    prices      | <----|   queries      |
|  on matches    |      |    TTL bets    |      |     risk       |
+----------------+      +----------------+      +----------------+
```
Hypothetical: Alice wants to build "AI customer support for pet stores." She queries the protocol for matching death modes.
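What such a query might return can be sketched as plain data. Every field name, mode label, and number below is invented for illustration; none of it is real output of the reference implementation:

```python
# Hypothetical risk-query result for Alice's idea. All fields, mode
# labels, and numbers are invented placeholders, not a real response.
alice_result = {
    "matched_death_modes": [
        {"mode": "vertical SaaS / customer too small to pay",
         "support": 14, "median_ttl_months": 11},
        {"mode": "AI wrapper / no retention after novelty fades",
         "support": 9, "median_ttl_months": 7},
    ],
    "market_implied_survival_12mo": 0.31,
}

def riskiest_mode(result: dict) -> str:
    # shortest median time-to-live = the most urgent matched risk
    return min(result["matched_death_modes"],
               key=lambda m: m["median_ttl_months"])["mode"]
```

A founder would triage by `median_ttl_months` first: the mode that kills fastest is the one to de-risk before writing code.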
In the past, your failure was a sunk cost. The time, money, and energy you poured in—all buried.
PING77 turns failure into a cash-flowing asset: every time someone steps into the same trap, your death data is reactivated and pays you out. The project you killed ten years ago might still be sending you checks in year ten, because the next generation of founders is still making the same mistakes.
> For the first time, failure compounds.
Any honest protocol has to put its problems on the table first:
[PRIVACY]
Failure stories implicate cofounders, customers, and investors who never agreed to go public
-> LLM extracts patterns only, not raw stories + selective disclosure
[FRAUD]
People will fabricate failures to farm tokens
-> On-chain anchoring of GitHub commits, domain registration, payments
[MANIPULATION]
Founders can bet on their own delayed death
-> Arguably fine: it sharpens survival instinct, and society comes out ahead on net
[COLD START]
No data, no liquidity in the early days
-> Seed with public post-mortems (CB Insights, etc.) first
This is a public thought experiment and reference implementation, not a product you can sign up for tomorrow. The code is on GitHub, fully open-source. If you want to run it: fork it, read the code, deploy it yourself.
You'll need your own LLM API keys, a prediction market framework (a Gnosis or Polymarket fork works), and the courage to seed the first batch of data with your own dead projects.
Three key technical problems in the protocol design.
Failure stories are inherently bad data. They're long, emotional, and full of self-justification—founders subconsciously put "the market wasn't ready" ahead of "I never found the right customer." Dump these stories into a vector database for similarity search and you'll get a pile of synonymous restatements of "startups are hard." The first-principles question for PING77 is: how do you distill this narrative garbage into structured signals a market can price?
We use a two-stage pipeline. Stage one is causal-chain extraction: an LLM uses a fixed schema to break each story into (trigger, mechanism, outcome, time) tuples. For example, "we burned 18 months on enterprise SaaS and died because the sales cycle was too long" becomes (enterprise sales, decision cycles > 6mo, cash exhausted, 18mo). This step compresses the narrative into a comparable causal skeleton.
Stage two is embedding-space clustering. We embed the mechanism field of every tuple into a 1536-dimensional space and run HDBSCAN density clustering. Each stable cluster core becomes a "death mode." A pattern is only minted as a tradable token if it's supported by at least N independent stories and its semantic distance variance falls below a threshold. This prevents the LLM from freelancing "novel ways to die."
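The minting gate at the end of stage two can be sketched in a few lines of plain Python. `min_support` and `max_variance` are placeholder thresholds, not protocol constants, and the real pipeline would run this over HDBSCAN cluster output rather than raw lists:

```python
import math
from statistics import pvariance

def should_mint(embeddings, min_support=5, max_variance=0.15):
    """Mint a death-mode token only for a well-supported, tight cluster.

    embeddings: list of equal-length float vectors, one per story in
    the candidate cluster. Thresholds are illustrative placeholders.
    """
    if len(embeddings) < min_support:
        return False  # not enough independent stories behind the pattern
    dim = len(embeddings[0])
    centroid = [sum(v[i] for v in embeddings) / len(embeddings)
                for i in range(dim)]
    # variance of each story's semantic distance to the cluster centroid
    dists = [math.dist(v, centroid) for v in embeddings]
    return pvariance(dists) < max_variance
```

A tight, well-populated cluster passes; a sparse or diffuse one is rejected, which is exactly the "no freelancing novel ways to die" rule.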
LLMs have a dangerous tendency in attribution work: they love generating explanations that sound profound. Ask one to analyze a failure story and it'll invent three layers of psychological motivation and five structural market factors—even if none of that is in the source. We counter this with three mechanisms:
(1) Citation constraint: every extracted causal chain must carry a character-level offset back to the source span. Attributions that can't cite the original text are dropped.
(2) Multi-model cross-validation: the same story is run through Claude, GPT, and Gemini independently; only causal chains all three agree on are kept.
(3) Reverse predictive testing: extracted death modes are tested against a held-out set. If predictive accuracy falls below the random baseline, the pattern is flagged as an "overfit hallucination" and removed from the library.
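Two of these guards are cheap enough to sketch directly. The field names (`cite_start`, `trigger`, and so on) are assumptions about the extraction schema, not a documented interface:

```python
def has_valid_citation(story: str, chain: dict) -> bool:
    """Guard (1): drop any causal chain that cannot point at a real span."""
    s, e = chain.get("cite_start", -1), chain.get("cite_end", -1)
    return 0 <= s < e <= len(story) and story[s:e].strip() != ""

def keep_agreed(chains_by_model: dict) -> set:
    """Guard (2): keep only causal skeletons every model produced independently.

    chains_by_model maps a model name to its list of extracted tuples.
    """
    keys = [{(c["trigger"], c["mechanism"], c["outcome"]) for c in chains}
            for chains in chains_by_model.values()]
    return set.intersection(*keys)
```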
Every existing startup post-mortem database (CB Insights, Failory, etc.) stops at "archive in human-readable form." They are museums, not markets. The real value of structured extraction is turning failure modes from stories into signals an algorithm can consume—and once they're signals, they can be priced, hedged, and compounded.
Traditional prediction markets like Polymarket and Augur are good at one thing: pricing whether some binary event will occur by a future date. "Will Trump win in 2024?" "Will BTC break $100k by year-end?"—these are classic yes/no contracts. But what PING77 needs to price isn't binary at all: whether a new startup will die isn't really a question (most do). The question is when it dies and how. That's a continuous-time, multi-mode survival process.
Our solution borrows the Kaplan-Meier estimator from survival analysis and bakes it into the market-maker logic. The market doesn't issue a single binary token; it issues a strip of time-bucketed contracts: DEAD_BY_3MO / DEAD_BY_6MO / DEAD_BY_12MO / DEAD_BY_24MO / SURVIVES_24MO+. Because the buckets partition the outcome space, their prices must sum to 1, and the cumulative death probabilities implied by the strip must be non-decreasing in time. Any violation of either constraint is free money for an arbitrageur, so the curve shape maintains itself.
The immediate benefit: the market price is an implicit survival curve. You can directly read off "the market thinks this project has a 23% chance of surviving past month 12," no extra computation required. For founders, this is a free, money-backed risk health checkup.
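Reading the curve off the strip is simple arithmetic. The sketch below treats each contract as paying out if death occurs within its bucket (so prices sum to 1); the bucket keys and all prices are invented:

```python
# Invented example prices for one project's time-bucket strip:
# each entry is P(project dies within that bucket). Must sum to 1.
strip = {"0-3mo": 0.30, "3-6mo": 0.22, "6-12mo": 0.25,
         "12-24mo": 0.13, "24mo+": 0.10}

def survival_past(strip: dict, buckets_died: list) -> float:
    """P(survives past a horizon) = 1 - sum of death buckets up to it."""
    return round(1.0 - sum(strip[b] for b in buckets_died), 10)

# market-implied chance of surviving past month 12:
p12 = survival_past(strip, ["0-3mo", "3-6mo", "6-12mo"])
```

With these invented prices, `p12` comes out to 0.23: the "23% chance of surviving past month 12" read straight off the board.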
For the market-making mechanism, we ultimately chose LMSR (Logarithmic Market Scoring Rule) over a Uniswap-style constant product (CPMM). The reason is cold start: CPMM needs initial liquidity to open, but every project in PING77 is its own market and most will never see significant volume. LMSR lets a market maker open at zero inventory with a fixed liquidity parameter b; the loss is bounded above by b·ln(N), which the protocol treasury can budget exactly.
The trade-off is that LMSR has higher slippage on large trades—but that's a feature, not a bug. It punishes whales trying to manipulate small markets.
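The textbook LMSR formulas behind this choice, as a minimal sketch (not the contract code):

```python
import math

def lmsr_cost(q, b):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b)).

    Worst-case market-maker loss from the q = 0 start is b * ln(N),
    which is what lets the treasury budget each market exactly.
    """
    return b * math.log(sum(math.exp(qi / b) for qi in q))

def lmsr_prices(q, b):
    """Instantaneous prices: always positive, always summing to 1."""
    z = sum(math.exp(qi / b) for qi in q)
    return [math.exp(qi / b) / z for qi in q]

def buy_cost(q, b, i, shares):
    """What a trader pays to buy `shares` of outcome i: C(q') - C(q)."""
    q2 = list(q)
    q2[i] += shares
    return lmsr_cost(q2, b) - lmsr_cost(q, b)
```

At zero inventory a five-bucket strip opens at uniform 1/5 prices, and the convexity of C is exactly the slippage that punishes whales: each successive lot of the same outcome costs more than the last.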
The hardest part of a prediction market has never been pricing—it's settlement. "Is this project dead?" is a much harder question than "did Trump win?" because founders have a strong incentive to pretend they're still alive. We use a three-layer settlement mechanism:
(1) Objective signals: domain expired, GitHub repo silent for 90 days, no new Stripe subscriptions, official accounts stopped posting—any two triggers move the project into arbitration.
(2) UMA-style optimistic arbitration: anyone can submit a "this project is dead" assertion with a posted bond; if no one disputes within a 7-day challenge window, it settles.
(3) Founder-initiated settlement: the founder can self-declare death and receive a small "honesty dividend"—an incentive to exit gracefully rather than zombie along.
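Layer (1) is mechanical enough to sketch. The signal names below are placeholders for whatever liveness feeds a deployment actually wires up:

```python
# Illustrative objective liveness signals; names are placeholders,
# not fields from the reference implementation.
DEATH_SIGNALS = ("domain_expired", "repo_silent_90d",
                 "no_new_subscriptions", "socials_silent")

def triggers_arbitration(observed: dict) -> bool:
    """Layer 1 rule: any two objective death signals open arbitration."""
    return sum(1 for s in DEATH_SIGNALS if observed.get(s, False)) >= 2
```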
This is the most counterintuitive part of the entire PING77 design: we don't issue a governance token, we don't issue a contributor token, we don't issue any "participate-to-earn" instrument. Because the moment you have a token, the contributor's optimal strategy shifts from "write an honest post-mortem" to "write whatever pumps the token"—and these two things are almost always opposites. Once the incentive layer is contaminated by a token, the signal-to-noise ratio of the entire dataset collapses immediately.
So why would anyone contribute? The answer: a cash share of protocol fees. Every time a new founder queries risk or someone trades in a project market, a fee is generated. That fee is distributed back to the original data uploaders by Shapley value across the death modes their data informed.
The advantage of Shapley values is that their definition of "marginal contribution" is the unique fair allocation in a game-theoretic sense: if your story was among the first seeds for a death mode, your share is significantly higher than that of the 500th person to add a similar story. Conversely, if you contribute a unique, rare way of dying—even if you're the only contributor—you keep collecting as long as the market keeps querying it.
```
payout(contributor_i) =
    Σ over all death modes m:
        shapley_value(i, m) × fees_generated(m)
```
This allocation function is deterministic, auditable, and independent of any token price. Contributor income is tied directly to "how many people actually use the protocol," not "how hot the narrative is."
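Exact Shapley values are only tractable for small contributor sets (at scale the computation would need sampling approximations, which the spec leaves open). Here is the textbook calculation with an invented value function in which contributor "a" holds a unique death mode worth one unit of fees, while "b" and "c" jointly support a second:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley value of each player under coalition value function v."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for coal in combinations(others, k):
                # probability weight of p joining exactly this coalition
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[p] += w * (v(frozenset(coal) | {p}) - v(frozenset(coal)))
    return phi

# Invented value function: "a" alone unlocks one death mode (1 unit of
# fees); a second mode only exists once both "b" and "c" contribute.
def v(coalition):
    return ("a" in coalition) + ({"b", "c"} <= coalition)

phi = shapley_values(["a", "b", "c"], v)
```

Here "a" earns the full unit for the unique mode while "b" and "c" split the jointly supported one, which is precisely the "rare ways of dying pay more" property claimed above.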
Money isn't the only incentive. Many founders are willing to share failures because they want to be seen by peers as honest and reflective—a form of social capital. PING77 captures this with a non-transferable reputation score: every time a death mode you contributed is hit by a new query or cited by a successful prediction, your reputation goes up. Reputation can't be traded, can't be airdropped, and can only be earned through real contribution.
High-reputation contributors get a few privileges: a higher fee-share multiplier (capped at 1.5x), free credits for querying new projects, and proposal rights in protocol governance. All of these are tied to "ability to use the protocol," not "ability to extract from the protocol"—which is the fundamental difference between governance tokens and reputation systems.
One might ask: if there's money at stake, won't people fabricate fake failure stories to farm Shapley shares? Yes. Our countermeasure is to make the cost of forgery higher than the expected payoff:
Every uploaded failure must come with at least two hard-to-fake timestamp anchors—a GitHub commit history (or equivalent verifiable repo), domain WHOIS registration, Stripe / Apple / Google Play revenue snapshots, or verifiable business registration records. These artifacts don't need to be public, but their hashes must be on-chain, and the LLM cross-checks temporal consistency during causal-chain extraction. A fabricated "we worked 18 months and failed" story without 18 months of real activity evidence gets dropped at the causal-chain stage and never enters the pattern library.
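The temporal-consistency check can be sketched as a pure function over whatever timestamp anchors were submitted. The 0.7 tolerance is an invented placeholder, not a protocol constant:

```python
from datetime import date

def activity_covers_claim(claimed_months, anchors, tolerance=0.7):
    """Cross-check a claimed project lifetime against timestamp anchors.

    anchors: dates drawn from commits, WHOIS records, revenue
    snapshots, etc. The observed activity span must cover at least
    `tolerance` of the claimed duration (tolerance is illustrative).
    """
    if len(anchors) < 2:
        return False  # protocol requires at least two independent anchors
    span_days = (max(anchors) - min(anchors)).days
    return span_days >= claimed_months * 30.44 * tolerance
```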
No complex on-chain identity system required, no KYC—you just need to make "forging a failure that passes verification" cost more than the expected payout from a real failure. In most cases, actually failing once is cheaper than faking it.
PING77 is an open-source reference implementation. There is no official hosted version. If you want to run it, you provide the infrastructure. This guide assumes you're comfortable on the command line, have a Linux server, and have touched Web3 development at least once. If not, read the three research notes first to understand the protocol itself before deciding whether to deploy.
A minimum viable deployment needs:
- Linux server (Ubuntu 22.04+ / Debian 12+)
- 4 vCPU / 8GB RAM / 50GB SSD
- Python 3.11+ / Node.js 20+
- PostgreSQL 15+ (for causal chains and pattern library)
- Redis 7+ (for market quote caching)
- Any EVM-compatible RPC (Base / Arbitrum recommended)
```
git clone https://github.com/PING77INC/PING77.git
cd ping77
make bootstrap        # install python / node deps
make db-init          # initialize postgres schema
cp .env.example .env
```
Edit .env with the required keys:
```
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
DATABASE_URL=postgresql://...
REDIS_URL=redis://localhost:6379
RPC_URL=https://mainnet.base.org
DEPLOYER_PRIVATE_KEY=0x...
```
Note that all three LLM keys are required—the protocol depends on multi-model cross-validation to fight hallucinations (see Research Note 001). If you only configure one, the extraction pipeline will refuse to start.
The market layer of PING77 is a set of LMSR contracts; each project gets its own market instance. Deploy the factory contract first:
```
cd contracts
forge build
forge script script/DeployFactory.s.sol \
    --rpc-url $RPC_URL \
    --broadcast \
    --verify
```
Write the resulting factory address back into .env as FACTORY_ADDRESS. This is the entry point for all subsequent market creation.
The extraction pipeline is the protocol's core service—it turns uploaded failure projects into structured death modes. It has two long-running processes:
```
make extractor   # causal-chain extraction worker
make clusterer   # embedding clustering worker (runs every 6h)
```
These two processes are not cheap on LLM tokens. Rough estimate: $0.15-$0.40 of API cost per failure story (depends on length and multi-model verification overhead). Test the full pipeline on a small dataset (10-20 stories) before opening up public uploads.
```
make api   # start FastAPI backend (default :8000)
make web   # start Next.js frontend (default :3000)
```
At this point, opening http://localhost:3000 should show an empty PING77 instance: zero projects, zero death modes, zero markets. This is the protocol's "genesis state."
An empty protocol is meaningless—with no death modes, new founders' queries return nothing and the market won't bootstrap on its own. We recommend pulling in a batch of public-source seed data before opening up public uploads:
```
make seed-public
# This pulls from public post-mortem sources:
# - CB Insights startup failure database
# - Failory public cases
# - Hacker News "Show HN: My failed startup" tag
# - Indie Hackers failure threads
```
The seed script takes roughly 2-6 hours. When it's done you'll have around 300-500 initial death modes—enough to put the protocol in a "queryable" state.
Once everything is in place, flip the upload switch:
```
UPLOADS_ENABLED=true make restart-api
```
When your first real contributor shows up, buy them a coffee. They're doing what 99% of people refuse to: making their most embarrassing post-mortem public.
Q: Can I skip deploying the market-maker contracts and only run extraction?
Yes. Set MARKET_LAYER=disabled. The protocol becomes a pure failure knowledge base with no pricing market. This is a reasonable choice for academic research or private deployments.
Q: What if I want to use my own modified LLM prompts for causal-chain extraction?
All prompts live under extractor/prompts/ as YAML files and can be edited directly. Be aware that significantly modified prompts may produce patterns that aren't compatible with other instances—cross-instance data exchange will break.
Q: What does deployment cost?
Server ~$40/month (Hetzner / DigitalOcean) + LLM API ~$0.15-0.40 per extraction + on-chain gas (~$0.5-2 per new market on Base). A small instance running 100 projects costs roughly $80-150/month.
Q: After I fork the protocol, how does my instance interoperate with others?
This is an unsolved research problem. The current PING77 is isolated—each instance maintains its own death mode library and markets. A cross-instance pattern recognition protocol is on the v0.2 roadmap, based on content addressing and federated learning ideas, but isn't implemented yet.
PING77 is not a plug-and-play product. It's a protocol spec and reference implementation. Treat it more like a research repo on Hugging Face than a SaaS. If you get a meaningful instance running—at any scale—come share your deployment story in GitHub Discussions. That's exactly the kind of "failure/success" data this protocol needs most.