Tencent2k

class dhg.data.Tencent2k(data_root=None)[source]

Bases: dhg.data.base.BaseData

The Tencent2k dataset is a social network dataset for the vertex classification task. It is a subset of the TencentBiGraph dataset. The vertices are social network users, and users who join the same social community are connected by a hyperedge.

The content of the Tencent2k dataset includes the following:

  • num_classes: The number of classes: \(2\).

  • num_vertices: The number of vertices: \(2,146\).

  • num_edges: The number of edges: \(6,378\).

  • dim_features: The dimension of features: \(8\).

  • features: The vertex feature matrix. torch.Tensor with size \((2,146 \times 8)\).

  • edge_list: The edge list. List with length \(6,378\).

  • labels: The label list. torch.LongTensor with size \((2,146, )\).

  • train_mask: The train mask. torch.BoolTensor with size \((2,146, )\).

  • val_mask: The validation mask. torch.BoolTensor with size \((2,146, )\).

  • test_mask: The test mask. torch.BoolTensor with size \((2,146, )\).
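The fields listed above are meant to agree with one another: the edge list should hold num_edges hyperedges, and the labels and masks should each have one entry per vertex. The following is a minimal sketch of such a consistency check, not part of DHG itself. The helper name `summarize` is hypothetical, and the sketch assumes that fields are read by key (for example `data["num_vertices"]`) and that each hyperedge is a sequence of 0-based integer vertex indices. It runs on a small hand-made stand-in so that it does not need the download.

```python
from typing import Any, Dict


def summarize(data: Any) -> Dict[str, Any]:
    """Cross-check the documented fields of a dataset-like object.

    `data` only needs to support key lookup for the field names
    listed in this section (hypothetical helper, not a DHG function).
    """
    n = data["num_vertices"]
    edges = data["edge_list"]
    per_vertex = [data[k] for k in ("labels", "train_mask", "val_mask", "test_mask")]
    return {
        # The edge list should contain exactly num_edges hyperedges.
        "edge_count_matches": len(edges) == data["num_edges"],
        # Assumption: vertex indices are 0-based integers below num_vertices.
        "vertex_ids_in_range": all(0 <= v < n for e in edges for v in e),
        # Labels and masks should each have one entry per vertex.
        "per_vertex_lengths_match": all(len(x) == n for x in per_vertex),
        # A hyperedge may join more than two vertices.
        "largest_hyperedge": max(len(e) for e in edges),
    }


# Toy stand-in with the same field names (not real Tencent2k data).
toy = {
    "num_vertices": 4,
    "num_edges": 2,
    "edge_list": [(0, 1, 2), (2, 3)],
    "labels": [0, 1, 1, 0],
    "train_mask": [True, False, False, False],
    "val_mask": [False, True, False, False],
    "test_mask": [False, False, True, True],
}

report = summarize(toy)
print(report)
```

With DHG installed, the same helper could be applied to `Tencent2k()` in place of `toy`, provided the assumptions above hold for the real object.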

Parameters

data_root (str, optional) – The root directory in which the data is stored. If set to None, the dataset is automatically downloaded from the server and saved to the default directory ~/.dhg/datasets/. Defaults to None.
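The behaviour described for data_root can be illustrated with a short sketch. It mirrors the documented rule only (None means the default directory ~/.dhg/datasets/) and is not DHG's internal code. The names `resolve_data_root` and `load_tencent2k` are hypothetical helpers introduced here. The import of DHG is deferred so that the path logic can be tried without the package or a network connection.

```python
from pathlib import Path
from typing import Optional

# Default location stated in the parameter description above.
DEFAULT_ROOT = Path("~/.dhg/datasets").expanduser()


def resolve_data_root(data_root: Optional[str] = None) -> Path:
    """Return the directory the dataset is expected to live in.

    Hypothetical helper: illustrates the documented rule only.
    """
    if data_root is None:
        return DEFAULT_ROOT
    return Path(data_root).expanduser()


def load_tencent2k(data_root: Optional[str] = None):
    """Construct the dataset with the documented signature.

    DHG is imported lazily; calling this may trigger a download
    when data_root is None.
    """
    from dhg.data import Tencent2k

    return Tencent2k(data_root=data_root)


print(resolve_data_root())                # default directory under the home folder
print(resolve_data_root("/tmp/my_data"))  # an explicit, user-chosen directory
```

Passing an explicit directory is useful on machines where the home directory is small or read-only. The path `/tmp/my_data` is only a placeholder.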