Photo by Raymond Tan on Unsplash

Tax administrators around the world are frustrated that they cannot catch cross-border transaction-based tax evasion on digital platforms. Many of these transactions are related to Daigous.

‘Daigou’ is originally a Chinese term with three meanings: a group of people who act as buying agents, purchasing products outside China and shipping them to residents in mainland China; the behaviour of acting as a buying agent; and the ‘industry’ itself. It now refers broadly to an e-commerce channel between buyers and professional shoppers located in different countries. Daigou activities can result in criminal offences such as smuggling, tax evasion, and money laundering.

Daigous have also exploited social media platforms to engage in black-market activities, in effect participating in the digitalised hidden economy. Our research has developed a machine learning strategy that enables the detection of digitalised hidden economy activities on social media platforms. We explored an example through Instagram. This research will help address tax evasion arising from these Daigou transactions, a necessary step in detecting and eliminating black-market (hidden economy) activities.

How do Daigous conduct transactions?

Social e-commerce refers to a business model in which the buying and selling process is completed on social media platforms. This model consists of cross-border and domestic transactions, where sellers take advantage of their extended social networks. Activities include all aspects of e-commerce, from instant messaging to answering customer enquiries to receiving payments via third party payment methods that are Fintech tools. Here, the social media platform is the primary place to generate business transactions.

There is evidence that daigou transactions between Australia and China, where Chinese consumers are the final purchasers, are widespread. A 2018 episode of A Current Affair (as reported here) addressed the topic of the sizable Daigou industry. The show revealed that the market size of the global Daigou industry is about AUD$15 billion. The number of participants in the Daigou industry in Australia is around 200,000 and China is the destination to which most of their purchases are exported.

Daigou and the hidden economy

The hidden economy is defined as ‘those economic activities and the income derived from them that circumvent or otherwise avoid government regulation, taxation or observation.’ The predominant platforms on which Daigous operate are social media platforms with digital payments enabled. Hidden online Daigou transactions share the same characteristics as traditional hidden economy transactions, where merchants prefer to receive anonymous or pseudonymous payments and do not declare the taxable income.

The rapid development of the Daigou industry may have resulted in serious evasion of income tax, and potentially also Goods and Services Tax (GST), in both source and destination countries. For Australia, it has been estimated that ‘up to $1 billion in undeclared taxable income may be slipping through the net, leaving a potential tax bill in the hundreds of millions.’

Machine learning based Regtech tool to achieve detection

Daigous leave digital footprints in their social media transactions, which enable detection with a suitable data-driven approach. In our recent research paper, we developed a case study to conduct an experiment on Instagram to search for Daigou transactions. We used #lipstick as the key search word to detect posts which are related to hidden economy activities. We built a design science artifact – a machine learning based Regtech tool for international tax authorities to detect transaction-based tax evasion activities on social media platforms.

To achieve detection, there were three stages in our research:

  1. Data mining using Python (a type of high-level programming language);
  2. Qualitative manual labelling to develop insights to train the machine;
  3. Developing the Regtech tool to detect transaction-based tax evasion activities on Instagram.

To build the dataset, the study employed data trawled from publicly available Instagram posts, including their corresponding poster information. Instagram posts were mined using the hashtag #lipstick in the period from 22 to 26 September 2019.

For each Instagram post, our study collected the username, post timestamp, number of likes, image, post text, and comments. The original post text was included as the first comment due to the way Instagram presents the posts. The study also extracted hashtags from the comments, as these usually form a significant part of the textual information. The study collected a total of 58,660 posts (short-lived and duplicated posts included) and from this data we produced a dataset of 2,081 randomly sampled unique posts for manual data mining.
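The step from raw collected posts to a random sample of unique posts can be sketched as follows. This is a minimal illustration rather than the study's own code: the `Post` record, its field names, the rule that two posts are duplicates when all three fields match, and the fixed random seed are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    # Hypothetical record; field names are illustrative, not from the study.
    username: str
    timestamp: str
    text: str


def sample_unique_posts(posts, k, seed=0):
    """Drop exact duplicates, then draw up to k posts uniformly at random."""
    # dict.fromkeys keeps the first occurrence of each post and preserves order.
    unique = list(dict.fromkeys(posts))
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return rng.sample(unique, min(k, len(unique)))


raw = [
    Post("shop_a", "2019-09-22T10:00", "new shades in stock #lipstick"),
    Post("shop_a", "2019-09-22T10:00", "new shades in stock #lipstick"),  # duplicate
    Post("user_b", "2019-09-23T08:30", "my favourite red #lipstick"),
    Post("shop_c", "2019-09-24T12:15", "DM to order, ships worldwide #lipstick"),
]
sample = sample_unique_posts(raw, k=2)
```

Sampling after deduplication means a post that was collected several times is no more likely to be selected than one collected once.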

Stage Two in our project was the data treatment process to build the training dataset for machine learning purposes. Before labelling, individual posts were examined for the purpose of designing the labelling codes. Nine properties were codified in the form of true or false questions or multiple-choice questions, where the questions can be answered in a form that the machine learning model can understand. This is similar to coding answers to survey questions, but the researcher does not have to ask the questions in the data collection process.

Figure 1: Questions to Ask in the Data Labelling Process
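To illustrate how true/false and multiple-choice answers become something a model can consume, the sketch below encodes one labelled post as a list of 0/1 features. The three questions and their answer options are hypothetical stand-ins chosen for the example; the nine properties actually used are those shown in Figure 1 and described in the paper.

```python
# Hypothetical labelling scheme: question names and options are illustrative only.
BOOLEAN_QUESTIONS = ["mentions_price", "offers_shipping"]
CHOICE_QUESTIONS = {"poster_type": ["individual", "reseller", "brand"]}


def encode_labels(answers):
    """Turn one post's answers into a flat list of 0/1 features."""
    vector = []
    # True/false questions map to a single 0 or 1.
    for question in BOOLEAN_QUESTIONS:
        vector.append(1 if answers[question] else 0)
    # Multiple-choice questions are one-hot encoded: one slot per option.
    for question, options in CHOICE_QUESTIONS.items():
        if answers[question] not in options:
            raise ValueError(f"unknown answer for {question}: {answers[question]!r}")
        vector.extend(1 if answers[question] == option else 0 for option in options)
    return vector


example = {"mentions_price": True, "offers_shipping": False, "poster_type": "reseller"}
encoded = encode_labels(example)  # [1, 0, 0, 1, 0]
```

One-hot encoding is used here for the multiple-choice answer so that the options are not treated as if they had a numeric order.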

Evidence of hidden economy transactions

Our analysis indicates that 22.21 per cent (464 out of 2,081) of the sampled posts are related to hidden economy transactions and may therefore result in tax evasion (see Figure 2 for an example of the posts). This high proportion suggests that hidden economy transactions on social media platforms have become very common and may lead to significant tax revenue loss. For further labelling results, please refer to our paper.

Figure 2: A Post on Instagram that Relates to Tax Evasion Activities

Based on these results, we developed a Regtech tool, which is a multi-modal deep neural network, to automatically detect suspicious posts. The proposed Regtech tool combines comments, hashtags and image modalities to produce the detection results.
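The general idea of combining modalities can be illustrated with a deliberately simplified sketch: feature vectors for comments, hashtags and the image are concatenated and passed through a single sigmoid output unit. This is not the architecture of our tool, which is a deep network described in the paper; the function name, the toy feature values and the hand-picked weights below are all assumptions for illustration, and nothing here is trained.

```python
import math


def fuse_and_score(comment_vec, hashtag_vec, image_vec, weights, bias):
    """Concatenate per-modality features and apply one sigmoid output unit."""
    fused = list(comment_vec) + list(hashtag_vec) + list(image_vec)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # squashes the logit into (0, 1)


# Toy feature vectors and hand-picked weights (not learned, not from the paper).
score = fuse_and_score(
    comment_vec=[0.9, 0.1],
    hashtag_vec=[0.7],
    image_vec=[0.2, 0.4],
    weights=[1.5, -0.5, 1.0, 0.3, 0.8],
    bias=-1.0,
)
```

In a real multi-modal network each of the three input vectors would itself be produced by a learned encoder for that modality, and the weights would be fitted to the labelled posts.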

Our model markedly improves the efficiency of detecting and confirming posts that relate to transaction-based tax evasion. Without the detection model, tax officers would need to select posts at random, as in the sampling described above. Applying our Regtech tool enables an initial identification of suspicious posts before manual analysis. Tax officers then manually confirm whether these posts relate to tax evasion.

Figure 3 presents demo results from our Regtech tool. The tool gives each Instagram post a detection score that signals its relevance to transaction-based tax evasion activities, considering the comments, hashtags, images, and their combinations within the post. The detection score ranges from 0 to 1, where ‘1’ means the machine regards the post as highly likely to be a black-market sale.

Figure 3: Regtech Demo Results
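A detection score between 0 and 1 lends itself to a simple triage step before manual confirmation. The sketch below shows one way this could work, assuming scores are already available per post; the function name, the 0.5 threshold and the example scores are hypothetical and are not taken from our tool.

```python
def select_for_review(scored_posts, threshold=0.5, limit=None):
    """Return ids of posts scoring at or above the threshold, highest first."""
    flagged = [(pid, s) for pid, s in scored_posts.items() if s >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    ids = [pid for pid, _ in flagged]
    return ids if limit is None else ids[:limit]


# Made-up scores for four posts (illustrative only).
scores = {"post_1": 0.93, "post_2": 0.12, "post_3": 0.67, "post_4": 0.48}
queue = select_for_review(scores, threshold=0.5)  # ['post_1', 'post_3']
```

The `limit` argument stands in for a fixed review capacity: officers would work through the highest-scoring posts first and stop when their capacity is reached.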


Among the posts our Regtech tool flags as suspicious, we can expect about 72 per cent to be confirmed as relating to tax evasion activities, compared with about 22 per cent of randomly selected posts. With the same amount of manual effort, detection efficiency therefore improves by more than three times.
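The efficiency gain follows from comparing the two confirmation rates reported in this article. The short calculation below uses those two percentages; the target of 100 confirmed cases is an arbitrary figure chosen only to make the comparison concrete.

```python
import math

baseline_rate = 0.2221  # share of randomly sampled posts labelled as hidden economy
model_rate = 0.72       # expected share among algorithmically selected posts

lift = model_rate / baseline_rate  # about 3.24

# Posts an officer would need to review to confirm 100 cases (illustrative target).
target = 100
reviews_random = math.ceil(target / baseline_rate)  # 451
reviews_model = math.ceil(target / model_rate)      # 139
```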

Tax revenue agencies, including the Australian Taxation Office, are increasingly aiming to combine data mining and machine learning tools to detect tax evasion. There is evidence that social media is a fertile site of hidden economy transactions, for example sales by Daigous to the Chinese consumer market. Our research supports the evidence that there may be significant tax evasion in this area and that machine learning approaches can assist in tax enforcement. We have developed a machine learning tool that could help tax authorities to identify audit targets in an efficient and effective manner to combat social e-commerce tax evasion at scale.


This article is based on: Eva Huang and Xi Nan, “Transaction-Based Tax Evasion in The Cross-Border Digital Economy: The Case of Daigou Activities on Social Media Platforms” (2020) 26(3) New Zealand Journal of Taxation Law and Policy, pp. 269-294; and, Lelin Zhang, Xi Nan, Eva Huang and Sidong Liu, “Detecting Transaction-based Tax Evasion Activities on Social Media Platforms Using Multi-modal Deep Neural Networks” (2020),
