Citation Network Dataset Download
These are quite small compared to the number of research papers (50 million).
A graph and network repository containing hundreds of real-world networks and benchmark datasets. The dictionary consists of 1433 unique words. Each included paper has at least one citation or reference in the network. The dataset is designed for research purposes only.
The citation data is extracted from DBLP, ACM, MAG (Microsoft Academic Graph), and other sources. Each line of the files corresponds to an attribute of a paper. Each dataset contains at least three files. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary.
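The 0/1-valued word vector described above can be illustrated with a minimal sketch. The function name `binary_word_vector`, the whitespace tokenization, and the four-word toy dictionary are all assumptions made for illustration; the datasets themselves ship precomputed vectors over their own dictionaries.

```python
def binary_word_vector(text, dictionary):
    """Return a 0/1 list with 1 where the dictionary word occurs in text."""
    # Assumption: lowercase, whitespace-split tokens; real preprocessing
    # (stemming, stop-word removal) is not specified here.
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in dictionary]


# Hypothetical toy dictionary; a real one has on the order of 1433 entries.
dictionary = ["graph", "neural", "citation", "kernel"]
vec = binary_word_vector("A citation graph of papers", dictionary)
print(vec)
```

Each position in the vector corresponds to one dictionary word, so every publication gets a vector of the same length regardless of how long its text is.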
The three files are idxs.txt, links.txt, and texts.txt. Nodes represent papers; edges represent citations. This large, comprehensive collection of graphs is useful in machine learning and network science.
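Since nodes represent papers and edges represent citations, a links file can be read into a simple adjacency structure. The sketch below assumes each line holds one whitespace-separated pair, "citing_id cited_id"; that layout is a guess, so check the dataset's README before relying on it. The function name `build_citation_graph` and the sample IDs are hypothetical.

```python
from collections import defaultdict


def build_citation_graph(lines):
    """Map each citing paper ID to the set of paper IDs it cites."""
    cites = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        citing, cited = parts
        cites[citing].add(cited)
    return cites


# Hypothetical sample in the assumed "citing cited" format.
sample = ["p1 p2", "p1 p3", "p3 p2", ""]
graph = build_citation_graph(sample)
num_links = sum(len(targets) for targets in graph.values())
print(num_links)
```

To read from disk under the same assumption, pass an open file handle, for example `build_citation_graph(open("links.txt"))`, since the function accepts any iterable of lines.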
We also provide interactive visual graph mining. Network data can be visualized and explored in real time via our web-based interactive network visual analytics platform. The citation network consists of 4732 links. So far I have found.
Stanford Large Network Dataset Collection. Patent citation network dataset information. Ground-truth network communities in social and information networks. The citation network consists of 5429 links.
DBLP citation network, ACM citation network. The README file in the dataset provides more details. Networks with ground-truth communities. This includes social network data, brain networks, temporal network data, web graph datasets, road networks, retweet networks, labeled graphs, and numerous other real-world graph datasets.
ACM Citation Network and DBLP Citation Network V8. All datasets are easily downloaded in a standard, consistent format. labels.txt is present if the labels are available. The manuscript can be found on arXiv (1905.00075). Our primary purpose is to develop a set of tools to standardize and facilitate use of the arXiv as a dataset.
What is Network Repository? The patent dataset is maintained by the National Bureau of Economic Research. The dataset spans 37 years (January 1, 1963 to December 30, 1999) and includes all the utility patents granted during that period, totaling 3,923,922 patents. Email communication networks, with edges representing communication. I am looking for an exhaustive citation network dataset for research papers, ideally identified with DOIs.
The following datasets have been cleaned into a fixed format for fast access. Online social networks: edges represent interactions between people. The dictionary consists of 3703 unique words.