Blockchain is Watching You: Profiling and Deanonymizing Ethereum Users

How private are cryptocurrencies?

Our lives are becoming increasingly digital. Year by year, we conduct more and more of our activities online. We buy and sell goods online. We consume advertisements and movies online. We work, make friends and maintain relationships with our loved ones online. Soon, our money will become digital as well. However, digitizing money comes with many difficulties.

Most of these difficulties were solved by Satoshi Nakamoto, who designed, implemented and launched Bitcoin as a response to the financial crisis of 2008. Bitcoin is decentralized, peer-to-peer, internet-native money that does not rely on any central entity, be it a government or a financial institution. Bitcoin uses a decentralized, public ledger called the blockchain, which records the transactions, and thus the balance, of every participant. Each participant can create an identity simply by generating a pair of keys, one public and one private. Hence, the identifier of a user is their public key. This means that anyone who learns a user's public key can also look up that user's past transactions and Bitcoin balance.

Cryptocurrencies and privacy

A central challenge of cryptocurrencies is financial privacy. If you buy a slice of pizza with a 10 USD bill, the cashier learns only that you own at least 10 USD. That is not much of a privacy leak, and it is necessary anyway to complete the transaction. In most cryptocurrencies, however, once the purchase is complete, the cashier can look up the sender’s address and see their account balance and entire transaction history. This is clearly unacceptable for financial privacy. As another motivating example, imagine that a car manufacturer pays its tire supplier in cryptocurrency. Anyone who knows the manufacturer's address can see the monthly outgoing transactions to the tire supplier just by looking at the blockchain. The amounts of these transactions are confidential and can be considered a business secret. However, the transparent nature of the blockchain makes it trivial for anyone to track them once they know where to look.

Since financial privacy is essential for almost all cryptocurrency use cases, it is imperative to understand how much privacy cryptocurrencies actually provide to their users. And if they do not provide sufficient privacy, how can we enhance it?

Ethereum, smart contracts and the account-based model

As of today, Ethereum is the most popular cryptocurrency by usage. Unlike Bitcoin, Ethereum is much more than a decentralized, censorship-resistant currency. It allows its users not only to send transactions but also to execute smart contracts: programs capable of validating and enforcing virtually any constraint in a transparent and decentralized way. Since the blockchain is immutable, the terms of a smart contract cannot be changed once it is deployed. This is beneficial, as no single party can unilaterally change the terms of a contract. Imagine that Alice and an insurance company create an insurance contract that pays Alice a certain amount of currency if her flight is delayed by more than an hour. With a traditional contract, the insurance company could dispute the arrival time of the airplane. Smart contracts allow no such after-the-fact modification, which makes these contracts trustless.

Beyond insurance, smart contracts have found many more use cases (e.g. crowdfunding, electronic voting, stablecoins, prediction markets, DeFi: decentralized finance) and seem to be a potential source of many more applications we cannot even think of today. Hence the popularity of Ethereum.

Profiling Ethereum users

In Ethereum, privacy and usability seemingly work against each other. Having a single account makes the platform easier to use, since one does not need to generate a new address for each incoming transaction. However, from a privacy point of view, having a single address is risky. Once this single address is exposed, its owner loses financial privacy permanently.

In our new research paper, we showed how easily one can profile owners of single, monolithic addresses with many outgoing transactions. For instance, the timestamps of outgoing transactions hint at the time zone of the account owner. We identified several other so-called quasi-identifiers of users. Even though quasi-identifiers such as time zone do not deanonymize users on their own, combining multiple quasi-identifiers potentially could. The transaction network itself turned out to be a very strong quasi-identifier: the structure of the connections between users can potentially identify them with the help of machine learning algorithms.
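To make the time-zone idea concrete, here is a minimal Python sketch. It is an illustration under our own assumptions, not the method from the paper: the function names are hypothetical, and it assumes that the owner is least active during an 8-hour sleep window that starts at 23:00 local time.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_activity(timestamps):
    """Count transactions per UTC hour of day (0-23) from Unix timestamps."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )
    return [counts.get(hour, 0) for hour in range(24)]


def guess_utc_offset(timestamps, sleep_hours=8, assumed_sleep_start=23):
    """Guess the owner's UTC offset in whole hours.

    Assumption (ours, for illustration): the quietest `sleep_hours`-long
    window of the day is when the owner sleeps, and that window starts
    at `assumed_sleep_start` local time.
    """
    activity = hourly_activity(timestamps)

    def window_load(start):
        # Number of transactions in the window beginning at UTC hour `start`.
        return sum(activity[(start + i) % 24] for i in range(sleep_hours))

    quiet_start_utc = min(range(24), key=window_load)
    # local time = UTC time + offset, so offset = local start - UTC start.
    offset = (assumed_sleep_start - quiet_start_utc) % 24
    # Map the result into the range -11 .. +12.
    return offset - 24 if offset > 12 else offset
```

Calling guess_utc_offset with the block timestamps of an address's outgoing transactions returns a single guessed offset. Such a guess is only a weak quasi-identifier by itself, and it becomes unreliable for addresses with few transactions or for automated accounts that are active around the clock.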

Deanonymizing privacy-enhancing tools

The lack of financial privacy on Ethereum is not news to many in the cryptocurrency community, and several privacy-enhancing technologies have been proposed and implemented. One popular technique is the mixer. In a cryptocurrency mixer, users deposit equal amounts of money into a smart contract. After a certain amount of time, users can withdraw their money from the contract without disclosing which deposit belongs to their withdrawal. This is achieved by using advanced cryptographic techniques such as zero-knowledge proofs. Mixers enhance the privacy of users because an observer cannot be sure which depositor corresponds to a given withdrawal. We usually refer to the set of depositors that could correspond to a withdrawal as its anonymity set. Mixers make the withdrawer indistinguishable within the anonymity set. However, careless usage can easily reveal the links between depositors and withdrawers, forfeiting any gained financial privacy not only for the careless user but also for the others in the anonymity set.
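The notion of an anonymity set can be sketched in a few lines of Python. This is a simplified model under our own assumptions: the Deposit type and the anonymity_set function are hypothetical names, every deposit has the same denomination, and the naive anonymity set of a withdrawal is simply every earlier deposit that has not already been linked to another withdrawal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposit:
    address: str    # address that funded the deposit
    timestamp: int  # Unix time (or block number) of the deposit


def anonymity_set(deposits, withdrawal_time, linked=frozenset()):
    """Return the deposits that could plausibly back a withdrawal.

    A deposit qualifies if it was made before the withdrawal and is not
    in `linked`, the set of deposits an observer has already tied to
    some other withdrawal.
    """
    return [
        d for d in deposits
        if d.timestamp < withdrawal_time and d not in linked
    ]
```

The linked parameter shows why carelessness hurts everyone: each deposit that an observer manages to tie to a withdrawal is removed from the anonymity set of every other withdrawal as well.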

On Ethereum, the most popular mixer is called Tornado Cash. In our work, we tried to find links between the deposits and withdrawals of Tornado Cash mixers. Even though we did not break any cryptographic tools, we were able to uncover several variants of careless user behavior. A simple observation is that if a user applies the same address for a deposit and a withdrawal, then we can link the corresponding deposit-withdrawal pair and remove the deposit from the anonymity set. Careless users can also reveal the link between their deposit and withdrawal addresses by sending a transaction between these two addresses.
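The two kinds of careless behavior described above can be expressed as simple linking heuristics. The Python sketch below is illustrative only: the MixerEvent type, the function names and the input format are our assumptions, not the code used in the paper. Deposits and withdrawals are given as (address, timestamp) records, and transfers as (sender, receiver) pairs observed outside the mixer.

```python
from typing import NamedTuple


class MixerEvent(NamedTuple):
    address: str    # depositing or withdrawing address
    timestamp: int  # Unix time (or block number) of the event


def link_same_address(deposits, withdrawals):
    """Heuristic 1: the same address deposits and later withdraws."""
    return [
        (d, w)
        for w in withdrawals
        for d in deposits
        if d.address == w.address and d.timestamp < w.timestamp
    ]


def link_by_direct_transfer(deposits, withdrawals, transfers):
    """Heuristic 2: a plain transaction connects the two addresses.

    `transfers` is an iterable of (sender, receiver) address pairs.
    The direction of the transfer is ignored.
    """
    edges = {frozenset(pair) for pair in transfers}
    return [
        (d, w)
        for w in withdrawals
        for d in deposits
        if d.address != w.address
        and d.timestamp < w.timestamp
        and frozenset((d.address, w.address)) in edges
    ]
```

Both functions return candidate pairs rather than certain links. An address with several deposits, for example, yields several candidates for one withdrawal, so a careful analysis would still need to resolve such ambiguities.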

Furthermore, negligent mixer usage is not the only thing that decreases the gained privacy. We observed that the profiling techniques described above can also reduce the privacy gains of mixer users.

Fingerprinting cryptocurrency users

Finally, we identified a potential active attack against user privacy. An attacker could maliciously fingerprint a user’s account balance. Using this fingerprint, the attacker could track all the financial activity of the targeted user, even if the unsuspecting user changes addresses during the course of the attack.

Orwell’s 1984 haunts Ethereum users

As many had suspected, our research paper shows clearly that Ethereum’s privacy guarantees are far from optimal. Even when advanced cryptographic tools are applied to preserve privacy, negligent users can easily forfeit their financial privacy. Hence, it is our responsibility as community members and academic researchers to educate users about these privacy risks, unless we want to end up in an Orwellian dystopia. It would be even worse than Orwell's, because in this dystopian world not just a single entity but everyone would be able to see what others are doing, in full transparency, without any privacy whatsoever.

Preprint: https://arxiv.org/pdf/2005.14051.pdf