The Spindl take on Web 3 privacy

Or, drinking the Web 2 user milkshake

May 16, 2023


Sunlight is said to be the best of disinfectants; electric light the most efficient policeman.   
                                       - Louis D. Brandeis, 'What Publicity Can Do' (1913)

The new Internet that’s coming won’t be a fully Web 3 phenomenon: it’ll be a hybrid of legacy Web 2 and novel blockchain functionality running under the hood (often without the user even realizing it).

Analytics and attribution for Web 3 will be much the same: normal browsers and mobile apps will serve as the ‘top of funnel’ onboarding for a user experience that ends in an on-chain action like a DeFi trade or NFT purchase.

Any technology that aims to capture where users are coming from, how much they spend or engage, and reward the entities that drove those downstream actions, will have to span two fundamentally different internets. Spindl is the pioneer in this technology, which makes us pioneers in its privacy implications too. Here we’ll discuss the various flavors of Spindl privacy and how they work in practice.

Firstly, a necessary digression:

Analytics and attribution are often conflated, but the two functions are quite different, and so are their uses of data. Analytics measures the user journey through a very important but narrow sliver of the overall user funnel, namely, the experience on your app. All analytics needs is a daily snapshot of where unique users went on your site, and how they moved around the various user flows (presumably presented in an enlightening way). That’s it.

The goal of analytics is to highlight, for example, how only 10% of users who reached your landing page clicked to a product page, perhaps due to a poor call-to-action (CTA). A leading analytics provider for companies conscious about user privacy is Plausible, which intentionally changes their device-side identifier daily. Plausible can get away with that transient identity because they’re looking at fleeting, real-time user journeys on a single service. (We’re jumping ahead of ourselves, but one version of Spindl works just like Plausible…more in a bit.)
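To make the landing-page example concrete, here is a minimal sketch of the kind of number an analytics tool reports. The event format, the page names, and the `step_conversion` helper are all assumptions made up for illustration; this is not Spindl's or Plausible's code.

```python
from collections import defaultdict

# Hypothetical one-day log of (visitor_id, page) events:
# ten visitors reach the landing page, one of them also views the product page.
events = [(f"v{i}", "landing") for i in range(10)] + [("v0", "product")]

def step_conversion(events, from_page, to_page):
    """Share of unique visitors who saw from_page and also saw to_page.

    Simplification: only checks that both pages were seen on the same day,
    not the order in which they were seen.
    """
    pages_by_visitor = defaultdict(set)
    for visitor, page in events:
        pages_by_visitor[visitor].add(page)

    reached = [v for v, pages in pages_by_visitor.items() if from_page in pages]
    if not reached:
        return 0.0
    converted = [v for v in reached if to_page in pages_by_visitor[v]]
    return len(converted) / len(reached)

print(step_conversion(events, "landing", "product"))  # 0.1, i.e. 10%
```

Note that nothing here needs the visitor ID to survive past the end of the day, which is why a daily-rotating identifier is enough for this job.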

Attribution is looking at user behavior on a much grander scale, from the Tweet posted a week ago to the user monetizing on your service this very day. That user journey spans the entire Internet, which now features both Web 2 and Web 3 components. If the mass-adoption crypto future is Web 2.X, where 2 < X < 8, then the user growth machine will have to expand into that legacy Web 2 world to some degree. It won’t work very well otherwise.

Let’s take a very simple and common user flow as an example:

1. A user sees a Discord post from an influencer about a new protocol and clicks through to the landing page.
2. Two days later, they see a tweet reminding them of the protocol, click through to its newly announced product, and sign in with a wallet.
3. Something happens in the market (ETH price starts spiking), and they search Google to find the protocol again and finally make a trade, turning into a real user.

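The flow looks like this:

[Diagram: Discord post and landing-page visit, then tweet and wallet sign-in, then Google search and on-chain trade]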

This is the reality of Internet usage. It’s not some clean click-engage-spend flow happening instantaneously: it’s a chaotic mix of events spanning two different forms of identity. One involves browsers and mobile devices and the other involves a global decentralized computer (the Ethereum Virtual Machine) with a global namespace of wallets. It’s the attribution system’s job to figure out how a user wandered from the former and did something on the latter.

How does Spindl do it?

In two ways, each reflecting a different setting of the privacy knob:

1. The standard way. This is the exact same way it happens across the entire Internet now, whether Google Analytics, Twitter, or large attribution platforms like Branch or AppsFlyer: a combination of cookies and calculated device identifiers roughly unify a user journey across many different apps and experiences. Spindl goes a significant step further, and indexes like crazy on the blockchain side: together, this means we can tie that Discord post and Google click to that on-chain action. The Spindl attribution machine picks a winner among upstream events, and all dashboards and payouts reflect that winner-take-all contest.
2. The ‘privacy safe’ way. We intentionally assign a transient device identifier to any Web 2 user touchpoint, refreshing the pseudonym regularly and essentially deleting any persistent data about that device. This is how Plausible and other privacy-aware services popular among some Web 3 projects work.

In both cases, the client-side identifier exists for no other reason than to maintain continuity of the user journey toward an on-chain action. In Spindl’s case, that identifier isn’t linked to anything else like ad exchanges or data warehouses. It never serves any other function than to join that Web 2 click to this page-view to that NFT purchase that happened just now.

That’s it.

There’s no other monkey business going on. No data ever makes it out of Spindl, other than to the dApp itself, which is the real custodian of the data and has already touched the user’s browser anyhow. The referral rewards that Spindl manages (and which reflect the attribution logic described here) are all paid on-chain, so everyone can see what got paid out to whom.

It’s all really quite simple and transparent.

If the project using Spindl opts for the more privacy-preserving version of our front-end SDK, then metrics like daily visits will be accurate, as will attribution for user journeys shorter than a day. Attribution for any journey that spans more than one day will not work, because the identifier that would connect the days no longer exists.
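Why does a transient identifier break multi-day journeys? A minimal sketch, assuming a daily-salted hash scheme of the kind privacy-focused analytics tools describe. The `DailySaltedIds` class and the fingerprint string are hypothetical and are not Spindl's SDK:

```python
import hashlib
import secrets

class DailySaltedIds:
    """Hands out pseudonyms that are stable within a day and unlinkable across days."""

    def __init__(self):
        self._salt_day = None
        self._salt = b""

    def identify(self, device_fingerprint: str, day: str) -> str:
        if day != self._salt_day:
            # New day: draw a fresh random salt and overwrite the old one,
            # so yesterday's pseudonyms can no longer be recomputed.
            self._salt_day = day
            self._salt = secrets.token_bytes(16)
        digest = hashlib.sha256(self._salt + device_fingerprint.encode()).hexdigest()
        return digest[:16]

ids = DailySaltedIds()
morning = ids.identify("hypothetical-device-fingerprint", "2023-05-14")
evening = ids.identify("hypothetical-device-fingerprint", "2023-05-14")
next_day = ids.identify("hypothetical-device-fingerprint", "2023-05-15")

print(morning == evening)   # same day: the journey can be stitched together
print(morning == next_day)  # different day: the link is gone
```

Same device, same day, same pseudonym; one day later it looks like a stranger. That is the whole tradeoff in a dozen lines.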

This isn’t necessarily ideal for Web 3, mind you.

Take the simple example diagrammed above: In a ‘first-touch’ attribution model [1], the initial influencer should be credited as the party that drove the user on-chain, as they were the first to nudge the user in the app’s direction. However, with hamstrung identity, the causal connection between that click and the eventual trade is broken.

Instead, Google (or some other publisher that deserves far less credit) is crowned the winner by the attribution model as that's all the model will see. The influencer won’t get their referral bounty, and very possibly some other form of media will get credit. Was that a win for Web 3? We tend to think 'no'. Nothing would perpetuate the old Web 2 world we’re all trying to escape more than dominant publishers like Google getting all the (undeserved) credit for Web 3 growth.
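The scenario above can be sketched in a few lines. This is a toy model under stated assumptions, not Spindl's attribution engine: the `Touch` and `Conversion` types, the `attribute` function, the device IDs, the wallet string, and the day of the Google search (day 5) are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Touch:
    device_id: str  # Web 2 identifier seen at the touchpoint
    day: int        # days since the first touch
    source: str     # where the click came from

@dataclass
class Conversion:
    device_id: str  # identifier seen when the wallet transacted
    day: int
    wallet: str

def attribute(touches, conversion, model="first_touch"):
    """Winner-take-all: pick one source among touches that share the converting device_id."""
    candidates = sorted(
        (t for t in touches
         if t.device_id == conversion.device_id and t.day <= conversion.day),
        key=lambda t: t.day,
    )
    if not candidates:
        return None
    winner = candidates[0] if model == "first_touch" else candidates[-1]
    return winner.source

# Persistent identity: all three touches share one device ID.
persistent = [Touch("dev-A", 0, "discord"), Touch("dev-A", 2, "twitter"), Touch("dev-A", 5, "google")]
trade_persistent = Conversion("dev-A", 5, "0xWALLET")

# Daily-rotating identity: each day's touch carries a different pseudonym.
rotating = [Touch("id-day0", 0, "discord"), Touch("id-day2", 2, "twitter"), Touch("id-day5", 5, "google")]
trade_rotating = Conversion("id-day5", 5, "0xWALLET")

print(attribute(persistent, trade_persistent, "first_touch"))  # discord
print(attribute(persistent, trade_persistent, "last_touch"))   # google
print(attribute(rotating, trade_rotating, "first_touch"))      # google
```

With rotating identifiers, even a first-touch model hands the win to Google, because the Discord and Twitter touches are invisible to it.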

Ultimately, it’s up to the company using Spindl to decide on their approach to privacy, as well as details like attribution model, churn windows, and all the other parameters that dictate the laws of physics inside digital media. We as the attribution system are but a data Switzerland trying to discern digital truth for our customer.

It’s early, but we expect very different privacy cultures to emerge in different sub-areas of crypto: DeFi is already opting for one very privacy-aware stance, while gaming follows more of a Web 2 standard. One workaround (and we’re foreshadowing future product releases with this; stay tuned) might be the fully Web 3 version of all this: only ever use pseudonymous wallet IDs in the attribution machinery, forgoing client-side identifiers (and heated privacy debates) altogether. That would be the true degen way to build this.

One thing we do feel strongly about:

If Web 3 is to one day have a billion users on-chain, it will only do so by drinking the Web 2 user milkshake and exploiting the old Internet’s publishers to bootstrap its own growth. And the only way to do that is to closely measure which of those (legacy) properties drive strong blockchain adoption.

There are no solutions in this world, only tradeoffs. While crypto projects might opt for increased privacy, they should realize that doing so compromises their ability to grow their user base in a data-driven way. At an even deeper level, a rejection of any and all Web 2 growth infrastructure, in the name of keeping Web 3 an oasis from those legacy concerns, may mean just that: Web 3 remains an oasis on a sparsely-populated island.


We’re hiring!

[1] An oft-overlooked aspect of any media ecosystem is the regnant attribution model that determines capital ‘T’ Truth inside that ecosystem. A good chunk of Google’s trillion-dollar valuation stems from imposing a ‘last-click’ attribution model that favors its own search engine monopoly, wherein most users’ last stop on the purchase journey is a Google search (even if that’s not why they’re really buying). Web 3 is still too nascent, but there will one day be apocalyptic battles over which model should be used to measure the positive impact of this publisher or that wallet on downstream monetization on-chain.
