YelpRestaurant
- class dhg.data.YelpRestaurant(data_root=None)[source]
Bases: dhg.data.base.BaseData
The Yelp-Restaurant dataset is a restaurant-review network dataset for the vertex classification task. All businesses in the “restaurant” catalog are selected as nodes, and each hyperedge connects the restaurants visited by the same user. The node label is the number of stars in a restaurant's average review, ranging from 1 to 5 stars in steps of 0.5 stars. The node features are formed from the latitude, longitude, one-hot encoding of city and state, and bag-of-words encoding of the top-1000 words in the restaurant's name. For more details, see the paper “You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks”.
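The hyperedge construction described above (one hyperedge per user, containing the restaurants that user visited) can be illustrated with a minimal sketch. The function name `build_hyperedges` and the toy visit records are hypothetical, and details such as how the original preprocessing treats users with a single visit are not stated here, so this is an illustration of the idea rather than the dataset's actual pipeline.

```python
from collections import defaultdict

def build_hyperedges(visits):
    """Group restaurant indices by user: one hyperedge per user."""
    by_user = defaultdict(set)
    for user, restaurant in visits:
        # A set ignores repeated visits to the same restaurant.
        by_user[user].add(restaurant)
    # Sort users and vertices so the output is deterministic.
    return [tuple(sorted(rs)) for _, rs in sorted(by_user.items())]

# Toy (user, restaurant_index) records, invented for illustration.
visits = [("u1", 0), ("u1", 2), ("u2", 1), ("u2", 2), ("u2", 3), ("u1", 0)]
edge_list = build_hyperedges(visits)
print(edge_list)  # [(0, 2), (1, 2, 3)]
```

Each tuple lists the vertex indices joined by one hyperedge, which matches the general shape of an edge list of vertex groups.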
The content of the Yelp-Restaurant dataset includes the following:
- num_classes: The number of classes: \(11\).
- num_vertices: The number of vertices: \(50,758\).
- num_edges: The number of edges: \(679,302\).
- dim_features: The dimension of features: \(1,862\).
- features: The vertex feature matrix. torch.Tensor with size \((50,758 \times 1,862)\).
- edge_list: The edge list. List with length \(679,302\).
- labels: The label list. torch.LongTensor with size \((50,758, )\).
- state: The state list. torch.LongTensor with size \((50,758, )\).
- city: The city list. torch.LongTensor with size \((50,758, )\).
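The size relationships among these fields can be checked with a small helper. The sketch below assumes the dataset object supports `data["key"]` lookups for the names listed above; the helper `check_sizes` and the toy stand-in dictionary are hypothetical, and the toy values are invented so the example runs without downloading anything.

```python
def check_sizes(data):
    """Return a list of mismatches between size fields and stored arrays."""
    problems = []
    n, d = data["num_vertices"], data["dim_features"]
    if len(data["features"]) != n:
        problems.append("features rows != num_vertices")
    elif n > 0 and len(data["features"][0]) != d:
        problems.append("features columns != dim_features")
    if len(data["edge_list"]) != data["num_edges"]:
        problems.append("edge_list length != num_edges")
    for key in ("labels", "state", "city"):
        if len(data[key]) != n:
            problems.append(f"{key} length != num_vertices")
    return problems

# Toy stand-in with the documented keys (values invented for illustration).
toy = {
    "num_classes": 2, "num_vertices": 3, "num_edges": 2, "dim_features": 2,
    "features": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "edge_list": [(0, 1), (1, 2)],
    "labels": [0, 1, 0], "state": [0, 0, 1], "city": [2, 2, 5],
}
print(check_sizes(toy))  # []

# With DHG installed, the same check could presumably be applied as
# (assumption: key-based access works on the dataset object):
#   from dhg.data import YelpRestaurant
#   check_sizes(YelpRestaurant())
```

Because the helper relies only on `len()` and indexing, it should work for nested lists as well as tensors.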
- Parameters
data_root (str, optional) – The directory where the data is stored. If set to None, the dataset is downloaded automatically from the server and saved into the default directory ~/.dhg/datasets/. Defaults to None.