Paper Title
Where Do You Want To Invest? Predicting Startup Funding From Freely, Publicly Available Web Information
Authors
Abstract
In this paper, we consider the problem of predicting the ability of a startup to attract investments using freely, publicly available data. Information about startups on the web usually comes either as unstructured data from news, social networks, and websites, or as structured data from commercial databases such as Crunchbase. The possibility of predicting the success of a startup from structured databases has been studied in the literature, and it has been shown that initial public offerings (IPOs), mergers and acquisitions (M&A), and funding events can be predicted with various machine learning techniques. In such studies, heterogeneous information from the web and social networks is usually used as a complement to the information coming from databases. However, building and maintaining such databases demands tremendous human effort. We therefore study whether one can rely solely on readily available sources of information, such as a startup's website, its social media activity, and its presence on the web, to predict its funding events. As illustrated in our experiments, the method we propose yields results comparable to those of methods that also use structured data available in private databases.