Someone used my IPFS gateway for phishing

Thu 4 October 2018

At 02:43 this morning, I received an abuse complaint email. It was sent by PhishLabs to DigitalOcean, and DigitalOcean forwarded it to me.

During an investigation of fraud, we discovered a compromised website ( that is being used to attack our client and their customers.

First detection of malicious activity: 10-02-2018 19:06:26 UTC
Most recent observation of malicious activity: 10-02-2018 19:28:25 UTC
Associated IP Addresses:

No “compromised website” here! The machine in question is my public IPFS gateway, which I run for the purpose of providing a convenient introduction to hardbin.

Around the same time that this email was forwarded to me, DigitalOcean disabled the network interface on my VPS in order to stop the phishing attack from working. Fair enough, can’t really expect them to do any more than that.

When I woke up and read this email, I visited the URL to verify that it was phishing and immediately blocked the offending IPFS hash (QmUyT6vFaGxgcyD7o7eX4eifvV5fVxxKyfSfNC8oP6JG1B) in my nginx config. I replied to DigitalOcean at 08:38, explaining that I had blocked the hash and asking them to switch my networking back on.

I decided to take a look at what exactly this phishing site was doing.

So it’s just a standard Google sign-in phish. Probably the victim receives an email telling them that they need to login to their Google account in order to resolve a problem, and the hyperlink goes to the URL given above.

You’ll note the URL has some random-looking text in the URL fragment:


so our first guess might be that some of the content is encrypted to avoid detection, and this is a key that then decrypts the content.

The HTML source is odd, however. It includes a mention of It turns out that “” is a tool that allows you to encode the entire content of a small web page in the URL fragment! And indeed, that is what is being done here.

The random-looking text in the URL fragment is actually the content of the page, compressed with LZMA and encoded with base64. If we reverse this, we get:

$ echo "XQAAAASQAAAAAAAAAAAeGgqG70rZWojk3J5AyyC5pdYwyJfWHIBHUGivJcLcQVE/oHjdV3pvUtyMRW499SqeMLLi7x4fSaIAi0Izhe+oGbBKFh28KmaEAtiTUDAZuJULtdxsBAqWIUpOddytlgEdBqFdM6cEiQ44qWXdB1OipKFu0hVNKIAA" | base64 -d | lzcat

It seems all this is doing is redirecting to a file called GMA.html hosted in somebody’s OneDrive folder. Using Firefox “Network Tools” verifies that the site is indeed fetching content from this OneDrive URL (it’s worth verifying this even though it is pretty obvious, because otherwise a sneaky attacker could slip any plausible content into the URL fragment in order to frame an innocent third-party).
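The base64 -d | lzcat pipeline above can also be sketched in Python, which may be handy when lzcat is not installed. This is a minimal sketch that assumes the fragment is plain base64 wrapped around a legacy .lzma stream, as the command above suggests. The names decode_fragment and encode_fragment are made up for this illustration, and the encoder exists only so the example can round-trip its own input.

```python
import base64
import lzma


def decode_fragment(fragment: str) -> str:
    """Reverse the encoding: base64 first, then LZMA decompression."""
    # URL fragments sometimes drop base64 padding, so restore it first.
    padded = fragment + "=" * (-len(fragment) % 4)
    raw = base64.b64decode(padded)
    # FORMAT_AUTO detects both .xz and legacy .lzma containers.
    return lzma.decompress(raw, format=lzma.FORMAT_AUTO).decode("utf-8")


def encode_fragment(html: str) -> str:
    """Hypothetical inverse, used here only to build a round-trip example."""
    raw = lzma.compress(html.encode("utf-8"), format=lzma.FORMAT_ALONE)
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    sample = "<p>hello</p>"
    print(decode_fragment(encode_fragment(sample)))
```

Decoding a fragment offline like this shows what a page would render without ever loading it in a browser.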

I also compared the HTML content that my server was returning to the content generated by “” and it was identical, which proves there is nothing else up the attacker’s sleeve.

So this is an interesting conundrum: since my server isn’t actually returning any phishing content for the requested URL, should I even be blocking it? The content returned by my IPFS gateway for this URL is just a generic HTML renderer. My server is no more complicit in the phishing campaign than the user’s web browser is! I reported my findings to DigitalOcean at 09:13 and asked if they would be happy for me to unblock the URL. There is certainly an argument that I shouldn’t unblock it, since blocking it demonstrably prevents the phishing attack. But it is also hard to argue that completely benign content should be blocked just because somebody somewhere is phishing.

Interestingly, the source of the OneDrive URL actually checks whether the ?z=a parameter is present. If so it deobfuscates the page content using atob, and if not it redirects via I’m not sure why.

I looked up the OneDrive URL in VirusTotal:

Only 3 of the 70 anti-phishing companies are actually classifying this URL as a phish. That is surprisingly poor.

In the past I have been very impressed with DigitalOcean’s abuse response staff. I used to run a Tor exit node on DigitalOcean and they were extremely patient with me, so I was expecting my networking to be switched back on promptly. However, by 7pm (nearly 11 hours after I blocked the URL in question) I had still received no further correspondence from DigitalOcean. And my networking was still switched off. This was very annoying.

My friend Miles Armstrong suggested taking a snapshot of the droplet and restoring it to a new droplet, and switching the hostname over to point at the new droplet. I thought that was a splendid idea, so I did it, and 15 minutes later was back online. Hurrah! I did inform DigitalOcean about what I’d done, but they still haven’t replied about that either.

So the problem is resolved for now, from my perspective (although I am still blocking a completely-benign IPFS hash, which I don’t feel good about). It’s worth noting that the completely-malicious GMA.html file is still not blocked by OneDrive (although their hosting provider doesn’t appear to have switched their networking off).

I am surprised that the phisher decided to host the actual content on OneDrive instead of putting the entire phish in the URL fragment. Given that OneDrive haven’t been very proactive in taking it down, it seems to have worked out, but the site would certainly be harder to take down if it only existed in the URL fragment.

And if you know a hosting provider that is less likely to switch your networking off, I’m all ears.

The Art of Subdomain Enumeration


In this blog post we will set you up with all you need to know about a dangerous art! Subdomain enumeration is an essential part of the reconnaissance phase in the cyber kill chain. Cyber attackers map out the digital footprint of the target in order to find weak spots through which to gain, for example, access to an internal network.


Subdomain enumeration is the process of finding valid (resolvable) subdomains for one or more domains. Unless the DNS server exposes a full DNS zone (via AXFR), it is really hard to obtain a list of existing subdomains. The common practice is to take a dictionary of common names and try to resolve each of them. While this method is effective in some cases, it misses subdomains that have unusual names. Another approach is to crawl the second-level domain in order to find links to subdomains (a faster approach is to use a search engine directly).
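The dictionary approach can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular tool implements it: the function names are hypothetical, the word list is supplied by the caller, and the resolver is passed in as a parameter so the logic can be exercised without touching the network.

```python
import socket
from typing import Callable, Iterable, List


def resolves(hostname: str) -> bool:
    """Return True if the system resolver finds at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except OSError:
        # socket.gaierror (e.g. NXDOMAIN) is a subclass of OSError.
        return False


def enumerate_subdomains(
    domain: str,
    words: Iterable[str],
    resolver: Callable[[str], bool] = resolves,
) -> List[str]:
    """Try each dictionary word as a label and keep the names that resolve."""
    found = []
    for word in words:
        candidate = f"{word}.{domain}"
        if resolver(candidate):
            found.append(candidate)
    return found
```

Calling enumerate_subdomains("example.com", ["www", "mail", "dev"]) would query the system resolver once per word. Only run it against domains you are authorized to test.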

After completing the subdomain enumeration process, the attacker finds a blog subdomain among the subdomains in the target’s DNS zone. The attacker enriches this finding up to the web application layer and finds out that the blog is using WordPress as a content management system. The attacker then runs wpscan in order to find WordPress vulnerabilities. Fortunately for the attacker, the target’s WordPress instance uses a vulnerable plugin, which the attacker is able to exploit to gain access to the environment and pivot further into the network. This example might seem a bit exaggerated, but this is exactly what happened in the Panama Papers case.

Let’s present the most popular open-source tools and techniques for performing subdomain enumeration. Here we go:

Zone transfer

The simplest and most basic technique is to try an AXFR request directly on the DNS server:

dig AXFR

A zone transfer is used to copy the content of the zone across primary and secondary DNS servers. The best practice advises administrators to allow AXFR requests only from authorized DNS servers, so the above technique will probably not work. But if it does, you have found a goldmine.

Similar to zone transfer, there is a so-called NSEC walking attack, which enumerates DNSSEC-signed zones.

Google Dorking

Time to use Google! Luckily, you can use various operators to refine your search queries (we also call these queries “Google dorks”). As mentioned previously, many subdomains can be found by crawling the target. Google (like other search engines such as Bing) does this as a byproduct of its primary purpose. We can use the site operator to find all subdomains that Google has found:

Rapid7 DNS dataset

Rapid7 publicly provides its Forward DNS study/dataset on repository. The DNS dataset aims to discover all domains found on the Internet. While they do a very good job, the list is definitely not complete. You can read more about how they compile their dataset here. After downloading the latest snapshot, we can run jq on it to find subdomains:

zcat snapshot.json.gz |
jq -r 'if (.name | test("\\.example\\.com$")) then .name else empty end'

jq tests each name against a regular expression meaning “ends with .example.com” to find all subdomains in the dataset.
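If jq is not at hand, the same filter can be approximated in Python. This sketch assumes what the jq command implies: the snapshot is gzip-compressed, holds one JSON object per line, and each object carries a name field. The function names and the path argument are placeholders chosen for this example.

```python
import gzip
import json
import re
from typing import Iterable, Iterator


def subdomains_from_records(lines: Iterable[str], domain: str) -> Iterator[str]:
    """Yield every record name that ends with '.<domain>'."""
    pattern = re.compile(r"\." + re.escape(domain) + r"$")
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name = json.loads(line).get("name", "")
        if pattern.search(name):
            yield name


def scan_snapshot(path: str, domain: str) -> Iterator[str]:
    """Stream a gzip-compressed snapshot without loading it into memory."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from subdomains_from_records(handle, domain)
```

Streaming line by line matters here because these snapshots can be very large. The leading dot in the pattern keeps unrelated names that merely end in the same letters out of the results.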

DNSDumpster is a free online service that uses exactly this technique.

Subject Alternative Name

Subject Alternative Name (SAN) is an extension in X.509 certificates that allows one certificate to carry several names for its subject. Companies often generate one certificate for multiple subdomains to save money.

We can look into certificates to hunt for subdomains in SANs using two different sources.

The first is Censys, an interface to a subset of published data. The good part is that it lets you search for keywords in certificates and thus potentially reveal new subdomains.

The second is an online certificate search service provided by COMODO. It uses a different dataset than Censys, but the principle is the same: find subdomains in certificates.

It is worth noting that, although some domains respond with NXDOMAIN, they still might exist on the internal network. Administrators sometimes reuse the certificates for public servers on their intranet servers…
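To see what the SAN extension looks like in practice, we can read it from the certificate a live server presents. Note that this is a different source from the certificate search services above: it inspects a single host directly instead of querying a dataset. It is a minimal sketch using only Python’s standard ssl module, and the helper names are our own.

```python
import socket
import ssl
from typing import Any, Dict, List


def san_dns_names(cert: Dict[str, Any]) -> List[str]:
    """Extract DNS entries from the dict returned by SSLSocket.getpeercert()."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> Dict[str, Any]:
    """Connect over TLS and return the validated peer certificate as a dict."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Calling san_dns_names(fetch_certificate("example.com")) would list every hostname the certificate covers, which often includes sibling subdomains.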


One of the most popular open-source tools for subdomain enumeration is called Sublist3r. It aggregates output from many different sources, including:

  • Google
  • Bing
  • Virustotal

While the data is correct in most cases, you might encounter non-resolvable subdomains (domains responding with NXDOMAIN). This is because Sublist3r relies heavily on passive data and it doesn’t validate whether the found subdomains really exist.
Sublist3r also uses a standalone project called subbrute. Subbrute uses a dictionary of common subdomain names in order to find a subset of subdomains that are resolvable.

To use it, simply run:

python -d

and the list of discovered subdomains will be presented to you.


Another open-source intelligence-gathering tool is theHarvester, which finds e-mail addresses on target domains as well as subdomains and virtual hosts. However, compared to Sublist3r, it provides fewer subdomain results. You can run theHarvester using the following command:

python -d -b all

Smart DNS Brute-Forcer (SDBF)

Subdomain enumeration tools often include a list of common subdomains that they try to resolve. This approach can be extended by using Markov chains in order to discover a subdomain name structure (e.g. if you have www1, it is likely that www2 exists, and so on). There is a research paper by Cynthia Wagner et al. explaining this technique in greater detail. The results produced by SDBF are far better than simple keyword enumeration of subdomains.
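The www1/www2 intuition can be illustrated with a far cruder stand-in than SDBF’s Markov model: take the labels already discovered and propose their numeric neighbours. The sketch below is not the SDBF algorithm, only a toy showing how known names can seed new candidates. The function name and the span parameter are our own inventions.

```python
import re
from typing import Iterable, List


def numbered_variants(labels: Iterable[str], span: int = 2) -> List[str]:
    """For labels ending in digits (www1), propose nearby numbers (www2, www3)."""
    known = set(labels)
    candidates = set()
    for label in known:
        match = re.fullmatch(r"(.*?)(\d+)", label)
        if not match:
            continue  # no trailing number to vary
        prefix, digits = match.groups()
        number = int(digits)
        for delta in range(-span, span + 1):
            neighbour = number + delta
            if delta == 0 or neighbour < 0:
                continue
            # zfill preserves zero padding, so "db01" yields "db02".
            candidates.add(prefix + str(neighbour).zfill(len(digits)))
    # Only return names we have not already seen.
    return sorted(candidates - known)
```

The candidates still have to be resolved before they count as findings. A real Markov-chain generator would learn character transitions from the observed names instead of only varying trailing numbers.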


Periodically checking what subdomains can be discovered by your cyber adversaries provides valuable input to vulnerability assessment teams and other tactical cyber teams in the organization.

Running frequent reconnaissance against your environment will provide you with greater visibility to find forgotten subdomains. The latter can expose your environment and the company to a wide range of threats like subdomain takeover or even full compromise as seen in the example at the beginning of this blogpost.

Even still, too many times have we witnessed a false feeling of security in cyber teams. By only partially understanding the dimension of the company’s digital footprint, internal teams are not always aware of their complete exposure and fail to minimize the risks.

But don’t worry, we are here to help!

Through our Sweepatic reconnaissance platform, we offer you and your company a unique and easy-to-use solution that will help you understand, monitor and reduce your dynamic digital footprint to minimize the risk of compromise.

The table below shows a benchmark we recently (data from 24-04-2017) did and contains subdomain enumeration results from Sweepatic and various other tools on the domain:

Subdomain enumeration comparison table

An Innovative Phishing Style

A few weeks ago, I added one of the many scammers trying to phish people on Steam. Usually, I block them after they drop their phishing website link but this particular website was pretty innovative (at least for me) in its attempt.

The chat seemed straightforward, the scammer wanted to give me an obviously profitable trade (they did keep trying to get me to add them on Discord for some reason). Near the end of the “trade” discussion, they asked me to log in to a convenient Steam backpack pricing website so they could get an idea of how much my stuff was worth.

The site in question was our fancy phishing website. It was essentially a copy of a legitimate Steam trading website.

Screenshots of the two websites. (Good on them for the warning!)

The website was hosted nicely on CloudFlare and the domain name registered with Namecheap. They even opted for the CloudFlare certificate! No room for corner cutters in this phish market.

Now, on to the trick. The site had a little JS chunk which would open up a pop up saying that the server is under high load and asking you to login with Steam to access the site.


Logging in with Steam launches a pop up opening the Steam website so you can authenticate via OpenID.

Login with Steam Pop-up

I was expecting this to be a scam so I was adequately confused looking at the absolutely normal looking pop-up. I tried Chrome DevTools to check out what was happening to make it look so good. Surprisingly, I ran into an anti-debugging script. Definitely not what I expected from a run of the mill phishing website but combined with the curious pop-up, this was looking well built.

I managed to extract and partially deobfuscate some JS which was trapping the debugger but it didn’t seem like the whole thing. I moved on with some good old fashioned viewing page source action… which was a heap more of obfuscated JS.

Almost giving up hope, I accidentally hovered over the Chrome icon in my task bar and just happened to realize that the “pop-up” did not result in two instances of Chrome in the task bar. The whole thing was just a drawn-up window inside the phishing website! They had even made some clickable buttons for the Chrome UI elements. This was confirmed by trying to right click on the title bar area of the pop-up, which opened up the right click context menu of a web page instead. It is still a mystery to me why a page like this would want to add anti-debugging measures, though.

Classic phishing detection – Right click the title bar

Definitely a unique phishing website experience for me, but on further googling some of the more interesting strings (“debug322”, mainly [possible 322 reference?]) of the JavaScript code, there were truckloads of such websites which seemed similar but I couldn’t confirm since they weren’t actually live and I was only running into cached versions of the pages. Nevertheless, a fascinating journey.

If anyone is able to deobfuscate and make sense of the JS snippets, I would love to know what they were doing. As far as I could tell, the debugger trap was basically calling the debugger function if it detected a running debugger. As for the other, larger JS block, I’m completely clueless. However, there were a few more fun things in the mix!

Although the folks had stopped the scam warning you may have noticed in the original screenshot from popping up on the phishing website, they didn’t bother to change it to not say

Always check the URL!

Most of the HTML source was directly lifted from, but they did change the logo on the top left… by switching to an imgur link!

Not to mention all these other assets generously hosted by Steve (Hi Steve!).


The above domain also just happens to contain a whole host of similar images, probably enough material for another blog post some day.

Update: Turns out this is a pretty well done picture in picture attack. Thanks internet strangers!

Are Index Funds Communist?

There won’t be much left to do once the investment robots perfect capitalism.

I have been half-joking for a year and a half that maybe index funds should be illegal, but here is an almost entirely serious claim from Sanford C. Bernstein & Co. that they are worse than communism:

In a note titled “The Silent Road to Serfdom: Why Passive Investing is Worse Than Marxism,” a team led by Head of Global Quantitative and European Equity Strategy Inigo Fraser-Jenkins, says that politicians and regulators need to be cognizant of the social case for active management in the investment industry.

“A supposedly capitalist economy where the only investment is passive is worse than either a centrally planned economy or an economy with active market led capital management,” they write.

The basic idea is straightforward. 1  The function of the capital markets is to allocate capital. Good companies’ stock prices should go up, so they can raise money and expand. Bad companies should go bankrupt, so that their resources can be re-allocated to more productive purposes. Analysts should be constantly thinking about whether companies are over- or underpriced, so that they can buy the underpriced ones and sell the overpriced ones and keep capital flowing to its best possible uses. 

But when those thoughtful active analysts are replaced with passive index funds, the market stops serving that function. Whatever the biggest company is today will remain the biggest company tomorrow, and capital will never be allocated from bad uses to good ones. Indexing is cheaper, yes, but that’s because active management has positive externalities, and if no one will pay for it, those benefits will disappear. 2

There is a lot of debate over whether this is actually how it works. For one thing, public stock markets are not really a mechanism for raising capital any more. 3 But more fundamentally, there is an alternative view that the rise of passive investing will improve capital allocation, because bad active investors will be driven out but good ones will remain. 4 The passive investors can’t influence relative prices, since they just buy the market portfolio, meaning that the fewer but better active investors will continue to make the capital allocation decisions. On this view, lower returns to active management are a sign that prices are more efficient and capital allocation is getting better. 5

Fraser-Jenkins et al. don’t buy it. 6 Their worry is that the growth in passive and quasi-passive products (not just true index funds but “smart beta” funds that invest based on historically predictive factors) has caused markets to become more correlated, as all the passive funds buy all the same stocks for the same reasons. They are not alone here; I like to quote Nevsky Capital’s final investor letter:

In such a world dominated by index and algorithmic funds historically logical correlations between different asset classes can remain in place long after they have ceased to be logical. 

“By definition,” write Fraser-Jenkins et al., “passive flows of capital, given that they seek to emulate or replicate what has already occurred must be backward looking.” And a market that is more correlated, they argue, will do a worse job of allocating capital. 7

Anyway it is a fascinating and delightful note but now let’s talk about something slightly different.

Imagine you are in charge of the economy. You decide how much of everything people should produce, and what the prices should be. It is hard! It’s hard to find out how much of the different things people want, and how much everything costs to make, and how to motivate people to make things, and so forth. 

There are three basic approaches that you can take 8 :

  1. You can be bad at it. You can just announce prices and quantities, and get them wrong, and there will be shortages and bread lines and corruption.
  2. You can be good at it. But I just said it was hard, so being good at it probably requires you to have a really fancy computer that takes lots of data and crunches it to decide on prices and quantities and so forth. 9
  3. You can have a market. You can just think of a market as a giant distributed computer for balancing supply and demand; each person’s preferences are data, and their interaction is the algorithm that creates prices and quantities. 10  

Choice 1 is more or less actually existing communism. 11 Choice 3 is more or less capitalism. 12 Choice 2 is more or less a fantasy, but the problem of figuring out how the computer would work is sometimes called the “socialist calculation problem,” and there is a shockingly wonderful novel about it, Francis Spufford’s “Red Plenty.”

The capital allocation problem is a subset of that more general resource allocation problem, 13 and has similar answer choices: You can have clunky central planning, or you can have a market where investors compete to buy securities and thus set prices, or you can have an ideal but as-yet-undiscovered computer do the allocating. Or I guess pure indexing — everyone passively throws money at everything that there is, with no judgment at all — is an imaginable fourth answer, and is strictly worse than the others. 14  

But the Bernstein note isn’t really about pure passive indexing, just buying and holding the market-cap-weighted portfolio of global financial assets. Not many people do that anyway. Instead they buy a S&P 500 index fund, say, which is a market-cap-weighted portfolio of large-cap U.S. stocks — itself a capital-allocation decision. Or they buy a smart beta fund, which is a portfolio of stocks chosen and weighted based on some set of factors that have historically determined performance. Fraser-Jenkins et al. note “the bizarre situation that there are more indices than there are large cap stocks.” They can’t all just be throwing money at every stock with no judgment. There is something a bit more going on there.

One way to think about them is that they are all crude algorithms for picking the best companies to allocate capital to. True, diversified, market-cap-weighted index funds are the crudest algorithm. They essentially assume that the companies that were good yesterday will probably be good tomorrow. This is not entirely true, of course, but it’s true enough to be useful, or at least to outperform most human money managers most of the time.

But there is no need to stick with such a crude algorithm. You might notice that, historically, some factors have been associated with outperformance. Companies with low price-to-book ratios might have outperformed companies with high price-to-book ratios, most of the time. So you might invest in a smart-beta value fund that focuses on buying stocks with low price-to-book ratios. If you do that, you are making a capital-allocation decision; you are giving money to companies that you think the market undervalues, whose fundamental performance should justify higher prices.
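To make the two crude algorithms concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the function names, the tickers and the numbers are hypothetical, and a real fund’s methodology is of course more involved than a sum and a sort.

```python
from typing import Dict, List


def cap_weights(market_caps: Dict[str, float]) -> Dict[str, float]:
    """Market-cap weighting: each company's share of total market value."""
    total = sum(market_caps.values())
    return {name: cap / total for name, cap in market_caps.items()}


def value_tilt(price_to_book: Dict[str, float], keep: int) -> List[str]:
    """Crude value factor: keep the names with the lowest price-to-book."""
    return sorted(price_to_book, key=price_to_book.get)[:keep]


if __name__ == "__main__":
    # Hypothetical market values and ratios, not real data.
    print(cap_weights({"BIG": 300.0, "SMALL": 100.0}))
    print(value_tilt({"BIG": 5.0, "SMALL": 0.8, "MID": 1.5}, keep=2))
```

The first function has no opinion beyond yesterday’s prices. The second already makes a capital-allocation decision, because it favours companies that look cheap relative to their books.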

Even that is pretty crude, though. You could get more nuanced, and invest in a fancy quantitative hedge fund that digs through mountains of data to find signals that have historically predicted stock prices, and then applies those signals to future prices. 

As these algorithms get more complicated, they also get more expensive to implement. S&P 500 indexing is basically free. Smart beta is more expensive, but everyone expects the price of smart beta to eventually fall to, basically, free. 15 Fancy algorithmic hedge funds are expensive, but they are perhaps being disrupted by hobbyist sites that let random data scientists build algorithms to predict future stock prices, and then allocate money to the best-performing ones. 

In Fraser-Jenkins’s taxonomy, all of these algorithms more or less count as “passive,” because they all look at historical correlations to predict future returns, as opposed to an “active” style that attempts to predict future returns based on a fundamental analysis of real economic factors. 16  But that distinction is not particularly clean. A smart-beta fund focused on the value factor is in a sense doing very crude fundamental analysis: It looks at a company’s financial statements, compares them to its stock price, and buys stocks that seem to be mispriced based on their fundamentals. My Bloomberg View colleague Noah Smith wrote today about two financial economists who published a more sophisticated fundamental-investing algorithm — one that uses 28 items from financial statements — that seems to outperform the market. You could implement that algorithm “passively,” as it were, but it seems to be making investing decisions based on where it sees fundamental mispricing, not just on past stock prices. 

And of course any human active investor is mostly relying on historical correlations and pattern-matching to make predictions of future fair value. You invest in Twitter because you think it will be the next Facebook, or you don’t invest in Twitter because you think it will be the next MySpace; you go long oil companies because real-economy conditions remind you of the last time oil rallied, or you go short because those conditions remind you of the last time oil tanked. Human investors reason by this sort of informal empiricism; robot investors just formalize it.

One broadly plausible thing to expect is that, in the long run, the robots will be better at this than the humans. 17 Another broadly plausible thing to expect is that, in the long run, the robots will keep getting better at it. 

What does it mean to say that the robots will keep getting better at it? Surely it means that robots will become more accurate at allocating capital to businesses that will perform best in the future. They will make those decisions partly by looking at the prices of financial assets (correlations among stock prices), and partly by looking at fundamental financial data (correlations between companies’ stock prices and their financial statements), and partly by looking at operational data (correlations between retail-industry stock prices and satellite pictures of retailers’ parking lots), and partly by looking at macroeconomic data (correlations between stock prices and interest rates or oil prices), and partly by looking at whatever else is handy and might somehow predict stock prices (correlations between stock prices and sunspots, or Twitter sentiment). And as more data is available and analytical techniques improve, they will get better and better at all of this. Along the way, there will be missteps and spurious correlations and herding into bad ideas, but in the very long run you’d expect the robots to constantly improve their capital-allocation decisions.

I mean, I would, though I don’t assert it as a law of economics or computer science. It’s more of an aesthetic sense about the possibilities for technology.

One other thing to consider is that eventually the best robot will predictably and repeatedly outperform the second-best robot, so why would you invest with the second-best robot? (This is, to some extent, what it means when people say that returns to algorithmic investing are declining.) Modern investment management supports a diversity of investing styles and products in part because people have truly different preferences about risk and where they want to invest, but also in part because it is hard to tell which style will perform better in the future. But as the Best Capital Allocating Robot gets better and better at allocating capital, it will be harder to justify investing in the Concentrated Mid-Cap Ultra Value Fund or whatever. Just invest in the Best Capital Allocating Robot! 18 He’s the best.

So the logical/whimsical end point is, if you want to invest in U.S. business, you give your money to the Best Capital Allocating Robot (U.S. Division), and that robot — whose prowess has been proven over time in fierce competition — applies the best algorithms to the best data set to make the best possible capital-allocation decisions, and you get the best returns, and the economy gets the best capital allocation.

Obviously this is all nonsense. I mean! Obviously! The robots will always be imperfect, and random chance will always intrude, and decisions based on past data will never perfectly predict the future, and investing preferences will always differ, and you’ll never be able to scientifically identify the best algorithm, and competition and diversity will always be important, and all of this is silly.

But isn’t it fun? I have joked, a couple of times, about modern financial capitalism solving the socialist calculation problem. One of those times was about Uber, which is apparently working on an algorithm to allocate cars based on data in the world, without the intermediation of a price system. There will just … magically … be more Ubers when demand is high, and fewer when demand is low, and that won’t be because the invisible hand of the market pulls more self-interested drivers onto the streets as more passengers bid up the price of their service, but just because Uber has a computer program that is really smart at telling cars where to go. The Best Capital Allocating Robot will be like that, only for financial capital instead of cars.

That is: The market is the best algorithm ever developed for allocating capital. So far! But it also creates incentives for someone to build a better algorithm.

Again: I know this is silly. But as a wild extrapolation of the far future of financial capitalism, I submit to you that it is less silly than the  “Silent Road to Serfdom” thesis. That thesis is that, in the long run, financial markets will tend toward mindlessness, a sort of central planning — by an index fund — that is worse than 1950s communism because it’s not even trying to make the right decisions.

The alternative view is that, in the long run, financial markets will tend toward perfect knowledge, a sort of central planning — by the Best Capital Allocating Robot — that is better than Marxism because it is perfectly informed and ideally rational. And once you have that, you can shut down the market: The game is over, and the Best Capital Allocating Robot won. The Fraser-Jenkins thesis is that algorithmic investing runs the risk of destroying capitalism by abandoning the pursuit of knowledge. But the really fun alternative is that it runs the risk of destroying capitalism by perfecting that pursuit: Once you have solved the socialist calculation problem, what do you need markets for?

How WeChat faded into the silence in India

Circa 2010, China: The aftermath of the Urumqi riots in 2009 resulted in 1.3 million websites being banned and blocked in the country. Even large internet companies like Fanfou, the Middle Kingdom’s first microblogging platform, shut down.

The next few years saw a raft of new internet companies launching new products and platforms. Meituan-Dianping, Toutiao, and Didi Chuxing, which today rank among the world’s top 20 technology companies, are, in a way, products of the 2010 clampdown.

One such launch, led by Allen Zhang, a product geek in China best known for creating and selling Foxmail, happened in 2010 at Tencent’s office in the southern Chinese city of Guangzhou. Zhang assembled a team of 10 engineers to build WeChat, which is today the world’s largest super app.

At that time, Tencent’s flagship product QQ, a desktop software for communication, was extremely popular. WeChat, known as Weixin when it launched, with a minimalistic design, was in stark contrast to the cluttered design of QQ.

After a historic leap of reaching its first 100 million users in just 433 days, WeChat decided to go global with the product. India, where Tencent already had a presence through travel services portal Ibibo in which it held a stake, was a natural choice to test the waters.

In early 2012, a team of a dozen employees was assembled in Gurgaon to launch a huge marketing campaign.

“WeChat was the first mobile app to launch a television ad in India. There was no expense spared,” says a former WeChat India executive from the time, asking to stay anonymous. Movie stars Parineeti Chopra and Varun Dhawan were roped in as brand ambassadors. There were ad campaigns that ran on TV and radio and in malls.

The app had good early traction. “We had gained about 20-25 mn subscribers during the campaign. For 45 days, WeChat was the top ranking app on Google Playstore,” the former executive says.

Initially, it appeared that WeChat’s China playbook was successful in India.

But like a roller coaster in an adventure park, the WeChat trajectory changed direction and began to nosedive fast. The number of uninstalls increased. There was no stickiness to the app. Soon, news reports suggested that the app could be banned by the government. That was the beginning of the end for WeChat in India.

FactorDaily reached out to Tencent and WeChat teams to get their inputs for the story but didn’t receive any response.

Why Indians didn’t we-chat for long

For its launch in India, WeChat’s parent Tencent assembled a team from Ibibo, including three or four developers, half a dozen people in marketing, and a couple in senior management. The team was led by Rahul Razdan and Nilay Arora. (Arora is the current country head for Tencent.)

“The product was designed with Allen Zhang’s vision and had peculiar features to the effect that worked well in China market but didn’t work well for Indian users,” says Himanshu Gupta, who was the associate director of marketing and strategy at WeChat India, from 2012 to 2015.

For example, WeChat made it mandatory for users to send “add friend requests” which had to be accepted by the other person for the chat to begin. Compared to this, in the case of WhatsApp, you could chat with anyone whose number was on your contacts list and had the app installed. The assumption made here was that if you had someone’s number stored on your phone, you knew him or her.

WeChat’s approach of enforcing “add-friend-requests” was similar to how Facebook works even today, where you can’t see another person’s private content unless they’ve accepted your friend request. WeChat did this because it was not only a messaging app but also a social app, with a newsfeed called “Moments” similar to Facebook’s newsfeed, which gave rise to user privacy concerns.

But this added a layer of friction to starting a chat interaction on WeChat, especially in the case of group chats. Chat groups are usually made by a single person (admin) inviting people by adding their phone numbers to the group. But this wasn’t possible seamlessly on WeChat in India since the admin had to first send “add friend requests” which had to be accepted. WeChat had not faced this hurdle of adding friends in China because Tencent’s hit messaging product from the desktop era – QQ messenger – already had over 750 million monthly active users by the time WeChat launched. A user could port her entire QQ social graph to WeChat by just logging in with her QQ ID.

Another design feature in the app allowed users to look up and send add-friend requests to WeChat users nearby. During initial onboarding, when users were just checking the app’s features, many would tap the “people nearby” feature, which would switch on location sharing by default, including with strangers. Once location sharing with strangers was switched on, it wasn’t very intuitive to turn it off.

“Women used to get a lot of unwarranted messages from men, which was a major turn off and many of them left the platform,” Gupta says. “China probably didn’t have this stalking problem.”

When this feedback was reported to China, the cultural nuance was missed or executives there didn’t think of these features as potential challenges, he adds.

Despite these challenges, WeChat had found a user base that stuck with the app for a year or so until they moved on to WhatsApp. With its 200 million monthly active users in India as of February 2018, the Facebook-owned app, which currently supports 10 Indian languages, has the largest market share among messengers here.

Food and lifestyle blogger Suman Prasad, who used WeChat as a post-grad student in 2013, says that a lot of his friends used WeChat for group chats, animated stickers and broadcast messages. “WeChat had great potential as a brand but I believe they couldn’t read the Indian mind better and failed to match the changing preferences of the young customers,” he says.

When it launched in India, the WeChat app was 40 MB in size. Most popular mobile phones in India at the time came with less than 100 MB of internal memory. With memory being an issue, WeChat was out of the window at the first roadblock. What also set the company behind the competition was that WeChat was available only on Android and iOS at a time when India still had a significant Symbian OS and BlackBerry market.

In 2013, when WeChat was at its peak in India, news reports claiming that the Indian government planned to ban the Chinese app started doing the rounds. “WeChat’s downfall started after a smear campaign that said the government officials planned to ban WeChat. Up until then, most users didn’t even know that it was a Chinese app; they had seen Varun Dhawan and Parineeti Chopra promoting the app,” says the former WeChat executive quoted earlier without a name.

Localisation challenges

The story of WeChat’s success in China is phenomenal. A tight ecosystem of over a billion users, WeChat is a super app that the Chinese use for everything – to hail a taxi, order food, or buy movie tickets and medicines.

Besides creating local content for India like Diwali stickers and housing some bit of technical support for partners in India, the company’s major focus remained to get more brands to sign up on the platform, to create a China-like ecosystem where brands ran promotions and gave discounts to users to follow them on WeChat, says Gupta. But in order to get more brands, it was essential to have a growing user base and an increasing amount of time spent on the app.

Getting brands on a chat platform is tough. Facebook’s Messenger is still struggling with it. It involves creating a large base of users that spend a good part of their day on the platform, creating segments of these users, getting brands that promote themselves to target user segments, and in the process creating stickiness for the app. WeChat was successfully able to do that in China because it was already the most preferred chat app in the country and had data of millions of users, thanks to the long innings of QQ, its parent app.

With hindsight, Gupta says, if the approach to marketing in India had focused on fixing the product design instead of focusing on brands, things might have turned out differently for the super app. But since WeChat was fighting an extremely competitive battle with Alibaba in China at that point in time, it was less keen to invest in product changes that weren’t relevant to the China market. The idea was to take a product that was doing well in China and go global by doing localisations around the brand and ecosystem partnerships, supported by aggressive marketing.

In China, where internet access was cheaper than in India in 2012, sending video files of, say, 4 MB was not a challenge. WhatsApp compresses a 5 MB photo to 40 kilobytes. WeChat did not compress files, so sending and receiving media took many minutes and a lot of data.

“There’s a term called internationalisation of a product, which is loosely used in China to translate the product to English and make it ready for users in another country. But at its core, it is a product envisaged and designed by people in China for the China market,” says the former WeChat executive, citing it as the main reason for WeChat’s failure in India.

Gupta echoes the same concern about the leadership. Even if some changes were made in the product, they were cosmetic and took long to be implemented. “We did get a Blackberry and Nokia Symbian builds for the app but that took more than a year and a half and, by that time, these mobile operating systems which were once big were almost dead, and the product WeChat had already lost all momentum in the India market,” he says.

The messenger market in India

By the time WeChat launched in the Indian market, Indians had already got a taste of messengers. WhatsApp was already popular in India, teaming up with carriers like Reliance for a bundled offering at Rs 16 a month. Facebook had launched its messenger. Homegrown Hike had also launched around the same time as WeChat. There was suddenly an influx of messengers, including LINE, Viber, Skype, and Hangouts, that offered voice messages, video calls, free calling and stickers to Indian users.

After initial customer acquisition tactics like tie-ups with telco providers and free talk-time for successful referrals, each of these products either found its niche or quit operating in India, barring WhatsApp, which turned out to be more or less an incumbent in India.

In February 2013, Hike topped the Play Store charts using a reward-based referral program and then sank. A few months down the line, WeChat and LINE also launched their marketing campaigns in India.

LINE, the Japanese app with Korean roots, came to India in July 2013 and reached #1 on app stores for a few weeks, with its own TV ad-based marketing campaign. Soon after WeChat’s TV commercials, LINE launched its own commercial in June 2013 and that October roped in actor Katrina Kaif as brand ambassador.

“In the minds of Tencent management, at least early on, they were competing against LINE and not WhatsApp because both of them were aggressively expanding globally with huge marketing campaigns,” says Gupta, who now heads growth at Walnut, an expense tracking and lending company.

The reason Tencent management was focussed on LINE in India was that WhatsApp was a passive player. WhatsApp was growing organically and there was nothing WeChat could do from a country launch point of view to stop it, says Gupta. But LINE was launching in countries globally left-right-and-centre, not dissimilar to how Uber and various taxi companies fought in a land-grab mode globally.

Around mid-2015, WeChat realised that it had hit a dead end in India. About a year later, in August 2016, Tencent led a $175 million funding round in Hike, raising its valuation to $1.4 billion. Hike, however, is still struggling with monetization and has been stuck with a user base of around 100 million for two years now. Safe to say Tencent’s bet on messengers in India hasn’t gone down well.

India has also witnessed its share of apps that have taken the super app route. Sequoia-backed Tapzo had its fair share of pivots, from online complaints redressal to a utility app aggregator, before it was sold to Amazon Pay in September this year. There’s also Alibaba-backed PayTM, which aims to be the WeChat of India.

As opposed to WhatsApp that was primarily built for emerging markets and had features like the low-memory build of the app and compressed media exchange features, players like WeChat and LINE started out in markets such as China and Japan that had a better quality infrastructure. Built for richer media interactions such as stickers, voice messages and video calling etc, WeChat and LINE, rather than fundamentally changing their products to suit the low-end phone markets, which India was in 2013, assumed that the world would eventually move to better quality phones and internet speeds, where their apps would provide a better messaging experience, says Gupta. While that did happen eventually with the launch of Jio in the year 2016 in India, the network effects of WhatsApp by that time had become so strong that no other player was left standing in the messaging space in the emerging markets.

The Chinese learn India

To be sure, WeChat’s problems in India are not unique. Chinese apps and services from other sectors, too, report similar troubles. The world’s largest bike ride-sharing company ofo launched in India in February this year only to wind up operations in less than six months.

When ofo launched in India early this year, the bike sharing market was rife with players, including another Chinese leader, Mobike, Zoomcar’s Pedl and Bengaluru-based Yulu, among others. These startups were introducing a new segment within urban India which was a level playing field, open to experimentation.

Rajarshi Sahai, who handled ofo’s India operations, says that despite the fact that ofo created its own playbook in terms of ground operations and policies designed specifically for the India market, the company took a hit when it came to the agility of its app product. The ofo India team’s attempts to localise or customise the app, which was operational in 22 countries and got 32 million hits per day, went through long gestation periods and prioritisation cycles, while its smaller and more agile rivals were faster when it came to improving their apps to suit Indian users.

Rajarshi Sahai, urban mobility expert & former head of ofo, India

Sahai says that for Chinese companies, mainly those supported by the BAT (Baidu, Alibaba, Tencent) trio, the China market is always the core business and the rest of the world is an expansion, which creates some sort of inertia when dealing with the competition here in India.

But it has become increasingly clear that India as a market cannot be generalised. As Sajith Pai, who works with VC firm Blume Ventures, puts it, India is divided into three consumer segments: the first 100 million, mainly urban or affluent Indians who are the main targets of indulgent e-commerce brands; the second 100 million, classified as the aspiring class; and the last, a little over a billion. He calls the three segments the splurgers, strivers and survivors.

Pai says that most global companies investing in India, including Apple, Facebook and Instagram, are aware of this graph and think of India as a secondary market, targeting only the first 100 million. Things may be changing from the WeChat days, with a new crop of Chinese companies trying to cater to the new 100 million, including rural India, new internet users and youngsters. Case in point: MX player, NewsDog, Shareit, and UCBrowser, which cater to the new internet users in India, have a higher chance of surviving in India because they understand the new Indian users better.

Still, unlike China, where there is a big government-initiated push for a common language and similar culture, India’s diversity, including the lack of a common language, varied city structures, and economic disparity, makes it difficult to generalize the Indian market. That’s the lesson that Tencent, and indeed the BAT trio, seem to have learnt: they are investing only in large Indian companies.