Yelp3k

class dhg.data.Yelp3k(data_root=None)

Bases: dhg.data.base.BaseData

The Yelp3k dataset is a subset of the Yelp-Restaurant dataset for the vertex classification task. It is a restaurant-review network: all businesses in the “restaurant” category are selected as vertices, and each hyperedge connects the restaurants visited by the same user. The state of each business is used as the corresponding vertex label.
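The hyperedge construction described above can be illustrated with a minimal sketch. It is not the preprocessing code used to build Yelp3k: the function name build_hyperedges, the min_size threshold, and the toy visit records are all hypothetical and serve only to show how "restaurants visited by the same user" become one hyperedge.

```python
from collections import defaultdict


def build_hyperedges(visits, min_size=2):
    """Group restaurants by visiting user; each group becomes one hyperedge.

    visits   -- iterable of (user, restaurant) pairs (hypothetical toy format)
    min_size -- assumption: hyperedges with fewer vertices are dropped
    """
    # Collect the set of restaurants each user has visited.
    by_user = defaultdict(set)
    for user, restaurant in visits:
        by_user[user].add(restaurant)

    # Assign a stable integer vertex id to every restaurant.
    restaurants = sorted({restaurant for _, restaurant in visits})
    vertex_ids = {name: idx for idx, name in enumerate(restaurants)}

    # One hyperedge per user, expressed as a sorted list of vertex ids.
    edge_list = []
    for user in sorted(by_user):
        edge = sorted(vertex_ids[r] for r in by_user[user])
        if len(edge) >= min_size:
            edge_list.append(edge)
    return vertex_ids, edge_list


# Toy data, invented for illustration only.
visits = [
    ("u1", "cafe_a"), ("u1", "diner_b"),
    ("u2", "diner_b"), ("u2", "grill_c"), ("u2", "cafe_a"),
    ("u3", "grill_c"),
]
vertex_ids, edge_list = build_hyperedges(visits)
print(edge_list)  # [[0, 1], [0, 1, 2]]
```

Here user u3 visited a single restaurant, so the sketch drops that one-vertex hyperedge. Whether the actual dataset applies such a filter is not stated in this documentation.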

The content of the Yelp3k dataset includes the following:

  • num_classes: The number of classes: \(6\).

  • num_vertices: The number of vertices: \(3,855\).

  • num_edges: The number of edges: \(24,137\).

  • dim_features: The dimension of features: \(1,862\).

  • features: The vertex feature matrix. torch.Tensor with size \((3,855 \times 1,862)\).

  • edge_list: The edge list. List with length \(24,137\).

  • labels: The label list. torch.LongTensor with size \((3,855, )\).

Parameters

data_root (str, optional) – The root directory where the data is stored. If set to None, the dataset is downloaded automatically from the server and saved to the default directory ~/.dhg/datasets/. Defaults to None.
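A short usage sketch follows, assuming the dataset object supports key-based access (for example data["features"]) as other DHG datasets do. The helper names load_yelp3k and check_yelp3k and the EXPECTED table are hypothetical and not part of the library; the check simply compares a loaded object against the sizes listed on this page.

```python
# Sizes copied from the field list on this page.
EXPECTED = {
    "num_classes": 6,
    "num_vertices": 3855,
    "num_edges": 24137,
    "dim_features": 1862,
}


def load_yelp3k(data_root=None):
    """Load the dataset; with data_root=None it is downloaded on first use."""
    # Imported lazily so the checker below can be used without DHG installed.
    from dhg.data import Yelp3k

    return Yelp3k(data_root=data_root)


def check_yelp3k(data):
    """Return a list of mismatches between `data` and the documented sizes.

    `data` is assumed to be indexable by the field names listed above.
    """
    problems = []
    for key, expected in EXPECTED.items():
        if data[key] != expected:
            problems.append(f"{key}: expected {expected}, got {data[key]}")

    # features: (num_vertices x dim_features)
    if tuple(data["features"].shape) != (3855, 1862):
        problems.append(f"features: unexpected shape {tuple(data['features'].shape)}")
    # edge_list: one entry per hyperedge
    if len(data["edge_list"]) != 24137:
        problems.append(f"edge_list: unexpected length {len(data['edge_list'])}")
    # labels: one label per vertex
    if tuple(data["labels"].shape) != (3855,):
        problems.append(f"labels: unexpected shape {tuple(data['labels'].shape)}")
    return problems
```

Calling check_yelp3k(load_yelp3k()) should return an empty list if the downloaded data matches the sizes documented here. Passing a directory path as data_root reads the data from that location instead of the default one.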