dhg.data
Base Class
- class dhg.data.BaseData(name, data_root=None)[source]
The Base Class of all datasets.
self._content = {
    'item': {
        'upon': [
            {'filename': 'part1.pkl', 'md5': 'xxxxx',},
            {'filename': 'part2.pkl', 'md5': 'xxxxx',},
        ],
        'loader': loader_function,
        'preprocess': [datapipe1, datapipe2],
    },
    ...
}
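As a rough illustration of how one entry of this structure might be consumed, the sketch below loads the files listed under 'upon' with the entry's 'loader' and then applies each 'preprocess' step in order. This is an assumption about the intended semantics, not DHG's actual implementation: resolve_item and load_pickles are hypothetical helpers, the order of application is assumed, and the placeholder md5 values are not checked here.

```python
import pickle
import tempfile
from pathlib import Path


def resolve_item(item, data_root):
    """Hypothetical helper: load an item's files, then run its preprocess steps in order."""
    paths = [Path(data_root) / f["filename"] for f in item["upon"]]
    data = item["loader"](paths)
    for step in item.get("preprocess", []):
        data = step(data)
    return data


def load_pickles(paths):
    """Hypothetical loader: unpickle each part and concatenate the resulting lists."""
    merged = []
    for path in paths:
        with open(path, "rb") as fh:
            merged.extend(pickle.load(fh))
    return merged


with tempfile.TemporaryDirectory() as root:
    # Write two small pickle files that stand in for downloaded dataset parts.
    for name, part in [("part1.pkl", [1, 2]), ("part2.pkl", [3])]:
        with open(Path(root) / name, "wb") as fh:
            pickle.dump(part, fh)

    item = {
        "upon": [
            {"filename": "part1.pkl", "md5": "xxxxx"},
            {"filename": "part2.pkl", "md5": "xxxxx"},
        ],
        "loader": load_pickles,
        # Two toy steps: double every value, then sum the list.
        "preprocess": [lambda xs: [x * 2 for x in xs], sum],
    }
    result = resolve_item(item, root)
```

Keeping the loader and the preprocessing steps as plain callables means each item can describe its own pipeline without the base class knowing anything about the file format.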
- property content
Return the content of the dataset.
- fetch_files(files)[source]
Download the files if they do not already exist, and check them.
- Parameters
files (List[Dict[str, str]]) – The files to download. Each element in the list is a dict with at least two keys: filename and md5. If the extra key bk_url is provided, it will be used to download the file from the backup URL.
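The file entries described above can be checked locally with a few lines of standard-library Python. The sketch below is an illustration only, not the library's code: md5_of and files_to_download are hypothetical helpers, and the bk_url value is a made-up placeholder. It decides which entries still need downloading by testing for existence and comparing MD5 digests.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_download(files, data_root):
    """Hypothetical helper: return entries whose file is missing or fails the MD5 check."""
    pending = []
    for entry in files:
        path = Path(data_root) / entry["filename"]
        if not path.exists() or md5_of(path) != entry["md5"]:
            pending.append(entry)
    return pending


with tempfile.TemporaryDirectory() as root:
    # part1.pkl is already present with a matching checksum; part2.pkl is missing.
    (Path(root) / "part1.pkl").write_bytes(b"hello")
    files = [
        {"filename": "part1.pkl", "md5": hashlib.md5(b"hello").hexdigest()},
        {
            "filename": "part2.pkl",
            "md5": "xxxxx",
            "bk_url": "https://example.com/part2.pkl",  # placeholder backup URL
        },
    ]
    pending = files_to_download(files, root)
```

Only the entries returned in pending would need to be fetched, from the primary location or from bk_url when one is given.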
Graph Datasets
- The Cora dataset is a citation network dataset for the vertex classification task.
- The PubMed dataset is a citation network dataset for the vertex classification task.
- The Citeseer dataset is a citation network dataset for the vertex classification task.
- The BlogCatalog dataset is a social network dataset for the vertex classification task.
- The Flickr dataset is a social network dataset for the vertex classification task.
- The Github dataset is a collaboration network dataset for the vertex classification task.
- The Facebook dataset is a social network dataset for the vertex classification task.
Bipartite Graph Datasets
- The MovieLens1M dataset was collected for the user-item recommendation task.
- The AmazonBook dataset was collected for the user-item recommendation task.
- The Yelp2018 dataset was collected for the user-item recommendation task.
- The Gowalla dataset was collected for the user-item recommendation task.
- The TencentBiGraph dataset is a social network dataset for the vertex classification task.
- The CoraBiGraph dataset is a citation network dataset for the vertex classification task.
- The PubmedBiGraph dataset is a citation network dataset for the vertex classification task.
- The CiteseerBiGraph dataset is a citation network dataset for the vertex classification task.
Hypergraph Datasets
- The Cooking 200 dataset was collected from Yummly.com for the vertex classification task.
- The Co-authorship Cora dataset is a citation network dataset for the vertex classification task.
- The Co-authorship DBLP dataset is a citation network dataset for the vertex classification task.
- The Co-citation Cora dataset is a citation network dataset for the vertex classification task.
- The Co-citation Citeseer dataset is a citation network dataset for the vertex classification task.
- The Co-citation PubMed dataset is a citation network dataset for the vertex classification task.
- The Yelp-Restaurant dataset is a restaurant-review network dataset for the vertex classification task.
- The Walmart Trips dataset is a user-product network dataset for the vertex classification task.
- The House Committees dataset is a committee network dataset for the vertex classification task.
- The 20 Newsgroups dataset is a newspaper network dataset for the vertex classification task.
- The DBLP-4k dataset is a citation network dataset for the node classification task.
- The DBLP-8k dataset is a citation network dataset for the link prediction task.
- The IMDB-4k dataset is a movie dataset for the node classification task.
- The Recipe100k dataset is a recipe-ingredient network dataset for the vertex classification task.
- The Recipe200k dataset is a recipe-ingredient network dataset for the vertex classification task.
- The Yelp3k dataset is a subset of the Yelp-Restaurant dataset for the vertex classification task.
- The Tencent2k dataset is a social network dataset for the vertex classification task.
Contributions of new datasets are welcome!