Detecting anomalies in financial statements using machine learning algorithm: The case of Vietnamese listed firms
Hoai Nam, Vuong
The vast amount of data and the rapid development of technology in recent years have changed the way many industries operate and compete. Vast volumes of data, commonly referred to as big data, provide valuable insights that help companies make informed business decisions. Companies in the financial services sector employ big data to inform their investment practices and make strategic decisions. The increased use and complexity of big data pose a challenge to users of financial information when analyzing financial statements. This is especially true for users who have fewer financial resources and less of the knowledge needed to conduct in-depth analysis of financial statements (Lokanan, 2014). Companies that want to present a rosy picture of their financial position may exploit these users’ deficiencies by deliberately misstating and omitting financial data in their annual reports (Rezaee, 2002; Albrecht et al., 2006, 2014; Robinson and Lokanan, 2017). Vietnamese companies were selected because of the high incidence of financial report manipulation (Tran, 2013). The number of listed companies reported by the Hanoi Stock Exchange (HNX) and the Ho Chi Minh City Stock Exchange (HOSE) increased steadily from 2000, when Vietnam’s securities market was in its infancy, to 2016. In 2016, there were more than 1,000 listed companies on these exchanges. Growth and structural development in Vietnam’s financial markets come with intense competition in the marketplace and the possibility of financial statement manipulation by companies listed on the HNX and HOSE (Tran, 2013). Indeed, the number of failed companies and cases of fraudulent reporting in Vietnamese markets has increased in the last few years: the figures were 9,467 companies in 2015, 12,478 companies in 2016, and 6,608 companies in the first seven months of 2017 (Agency of Business Registration, 2017).
The volume and intensity of fraudulent reporting have made it difficult for humans to process and analyze anomalous transactions (Grace et al., 2017). Even some traditional statistical regression techniques cannot be applied because of the complexity of the data set (Fan and Li, 2006). Thus, we need embedded analytical models with highly automated operating structures to deal with a volume, variety of features, and velocity of data that the human brain cannot handle. This is where big data techniques come into play. Big data has brought with it novel techniques, such as machine learning algorithms, that allow users to conduct in-depth analysis and gain a deeper understanding of anomalies in financial statements. Analyzing big data with machine learning techniques can help users of financial statements detect unusual patterns and transactions in companies’ financials, and both users and companies can draw on such data for data-driven insights into financial statement anomalies. This study is an attempt to use machine learning algorithms to detect anomalies in the financial statements of Vietnamese listed firms. As mentioned, the only resources available to ordinary investors are quarterly reports, which may contain misleading financial information. It is not enough to look only at the raw figures in such reports. Many studies have shown that analyzing financial ratios calculated from the values in companies’ reports is effective (see Altman, 1968; Kotsiantis et al., 2006; Pustylnick, 2011). Therefore, we approached the problem by using financial ratios as a series of variables, also known as features. An important point in this paper is that the values of financial ratios are assumed to follow a multivariate distribution, which means each ratio varies around one specific mean value.
This assumption allows us to identify anomalous data by measuring whether the distance of each datum to the ‘centroid’ (which will be explained in Research Methodology) exceeds a certain threshold. Additionally, we take the concept of distance further by treating it as the degree of the anomaly. This extension enables us to rank the creditworthiness of each company in each quarter: the more anomalous a datum, the less creditworthy the company in that quarter. Therefore, the central question of this paper is as follows: is it possible to rate the creditworthiness of a firm’s financial quarter using an anomaly detection method?
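The distance-to-centroid idea can be illustrated with a minimal sketch. The abstract does not state which distance measure, threshold, or ratios the paper uses, so everything below is an assumption made for illustration: the Mahalanobis distance stands in for "distance to the centroid", the threshold is an arbitrary 95th-percentile cutoff, and the three ratio columns are synthetic data rather than figures from Vietnamese firms. The function names `anomaly_scores` and `flag_and_rank` are hypothetical.

```python
import numpy as np


def anomaly_scores(ratios):
    """Distance of each firm-quarter from the centroid of all observations.

    Assumption: Mahalanobis distance is used here as one plausible choice;
    the abstract does not name the distance measure.
    """
    X = np.asarray(ratios, dtype=float)
    centroid = X.mean(axis=0)              # mean value of each ratio
    cov = np.cov(X, rowvar=False)          # covariance between ratios
    inv_cov = np.linalg.pinv(cov)          # pseudo-inverse for stability
    diff = X - centroid
    # Quadratic form diff_i^T * inv_cov * diff_i for every row i
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return np.sqrt(np.maximum(d2, 0.0))


def flag_and_rank(ratios, quantile=0.95):
    """Flag rows beyond a threshold and rank rows from most to least anomalous.

    Assumption: the threshold is an empirical quantile of the scores,
    chosen only for illustration.
    """
    scores = anomaly_scores(ratios)
    threshold = np.quantile(scores, quantile)
    flagged = scores > threshold
    ranking = np.argsort(-scores)          # most anomalous first
    return scores, flagged, ranking


# Synthetic example: 200 firm-quarters, 3 hypothetical financial ratios.
rng = np.random.default_rng(0)
X = rng.normal(loc=[1.5, 0.4, 0.1], scale=[0.2, 0.05, 0.02], size=(200, 3))
X[7] = [4.0, 0.9, -0.3]                    # one deliberately unusual row

scores, flagged, ranking = flag_and_rank(X)
print("most anomalous row:", ranking[0])
print("number flagged:", int(flagged.sum()))
```

In this sketch a larger score means a datum lies further from the centroid, so the ordering in `ranking` corresponds to the abstract's reading of distance as a degree of anomaly, with the highest-scoring firm-quarters treated as the least creditworthy.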
Description: This is the accepted manuscript version of the article. A link to the definitive version of record will be posted here once available.