With little training, machine-learning algorithms can uncover hidden scientific knowledge

Science Daily  July 3, 2019
A team of researchers in the US (Lawrence Berkeley National Laboratory, UC Berkeley) fed 3.3 million abstracts from materials science papers, published in more than 1,000 journals between 1922 and 2018, into an algorithm called Word2vec. The algorithm turned each of the approximately 500,000 distinct words in those abstracts into a 200-dimensional vector, that is, an array of 200 numbers. Using these vectors, it predicted discoveries of new thermoelectric materials years in advance and suggested as-yet-unknown materials as thermoelectric candidates. The research suggests that latent knowledge regarding future discoveries is to a large extent embedded in past publications.
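To make the idea of words as vectors more concrete, here is a minimal sketch of how candidate materials might be ranked once every word has a vector. It assumes that closeness between a material's vector and the vector for "thermoelectric" is measured with cosine similarity, which is a common choice for word embeddings but is not spelled out in the summary above. The vectors are invented 4-dimensional toy values rather than real 200-dimensional Word2vec output, and the names `material_a`, `material_b`, `material_c` and `rank_candidates` are hypothetical placeholders.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings: invented values for illustration only.
# A real model would hold one 200-number vector per distinct word.
embeddings = {
    "thermoelectric": [0.9, 0.1, 0.3, 0.0],
    "material_a": [0.8, 0.2, 0.4, 0.1],
    "material_b": [0.1, 0.9, 0.0, 0.3],
    "material_c": [0.5, 0.5, 0.2, 0.2],
}


def rank_candidates(target, candidates, vectors):
    """Sort candidate words by similarity to the target word, highest first."""
    scores = [
        (name, cosine_similarity(vectors[target], vectors[name]))
        for name in candidates
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates(
    "thermoelectric",
    ["material_a", "material_b", "material_c"],
    embeddings,
)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch, a material whose vector points in nearly the same direction as the "thermoelectric" vector ends up at the top of the list, even if the two words never appeared together in any single abstract. That is one way to picture how knowledge that is only implicit in past publications could surface as a prediction.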
