News20
- class dhg.data.News20(data_root=None)
Bases: dhg.data.base.BaseData
The 20 Newsgroups dataset is a newspaper network dataset for vertex classification tasks. The vertex features are the TF-IDF representations of news messages. For more details, see the paper "You are AllSet: A Multiset Learning Framework for Hypergraph Neural Networks".
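As a rough illustration of what a TF-IDF representation is, here is a minimal sketch on three toy documents. It uses the textbook variant (term frequency times log inverse document frequency); the exact weighting and preprocessing used to build this dataset are not stated here, so treat `tf_idf` and the toy documents as assumptions for illustration only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Textbook TF-IDF for a list of non-empty tokenized documents.

    Hypothetical helper for illustration; not part of DHG.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    rows = []
    for doc in docs:
        counts = Counter(doc)
        # tf = relative frequency in the document, idf = log(N / df).
        rows.append(
            [(counts[t] / len(doc)) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, rows


# Toy messages, one token list per vertex.
docs = [
    ["space", "nasa", "orbit"],
    ["hockey", "team", "space"],
    ["team", "win", "hockey"],
]
vocab, rows = tf_idf(docs)
```

Each row of `rows` plays the role of one vertex's feature vector, with one column per vocabulary term.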
The content of the 20 Newsgroups dataset includes the following:
num_classes
: The number of classes: \(4\).
num_vertices
: The number of vertices: \(16,342\).
num_edges
: The number of edges: \(100\).
dim_features
: The dimension of features: \(100\).
features
: The vertex feature matrix. torch.Tensor with size \((16,342 \times 100)\).
edge_list
: The edge list. List with length \(100\).
labels
: The label list. torch.LongTensor with size \((16,342, )\).
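The fields listed above constrain one another: `features` and `labels` have one entry per vertex, and `edge_list` has one entry per hyperedge. The sketch below checks these relations on a small stand-in dictionary that uses plain Python lists instead of tensors. `check_content` and the toy values are hypothetical, and the sketch assumes that each hyperedge is a collection of vertex indices and that labels lie in `0 .. num_classes - 1`.

```python
def check_content(content):
    """Return a list of inconsistencies between the dataset fields.

    Hypothetical helper for illustration; not part of DHG.
    """
    problems = []
    n = content["num_vertices"]
    if len(content["features"]) != n:
        problems.append("features must have one row per vertex")
    if any(len(row) != content["dim_features"] for row in content["features"]):
        problems.append("every feature row must have dim_features entries")
    if len(content["edge_list"]) != content["num_edges"]:
        problems.append("edge_list must have num_edges entries")
    # Assumption: each hyperedge is a collection of vertex indices.
    if any(not 0 <= v < n for edge in content["edge_list"] for v in edge):
        problems.append("hyperedges must reference existing vertices")
    if len(content["labels"]) != n:
        problems.append("labels must have one entry per vertex")
    # Assumption: class labels are integers in 0 .. num_classes - 1.
    if any(not 0 <= y < content["num_classes"] for y in content["labels"]):
        problems.append("labels must lie in 0 .. num_classes - 1")
    return problems


# Toy stand-in with the same keys as the dataset, at a much smaller scale.
toy = {
    "num_classes": 2,
    "num_vertices": 4,
    "num_edges": 2,
    "dim_features": 2,
    "features": [[0.1, 0.0], [0.0, 0.3], [0.2, 0.2], [0.0, 0.0]],
    "edge_list": [(0, 1, 2), (2, 3)],
    "labels": [0, 1, 1, 0],
}
problems = check_content(toy)
```

An empty `problems` list means the stand-in is self-consistent. The same checks would apply to the real content after converting tensors to lists or replacing `len` with shape lookups.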
- Parameters
data_root (str, optional) – The directory where the data is stored. If set to None, the dataset is downloaded automatically from the server and saved to the default directory ~/.dhg/datasets/. Defaults to None.
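The fallback behaviour of `data_root` can be sketched as follows. `resolve_data_root` is a hypothetical helper written for this illustration, not DHG's actual implementation; it only mirrors the documented rule that `None` maps to `~/.dhg/datasets/`.

```python
from pathlib import Path


def resolve_data_root(data_root=None):
    """Return the directory that would hold the dataset files.

    Hypothetical helper; it mirrors the documented default only.
    """
    if data_root is None:
        # Documented default location: ~/.dhg/datasets/
        return Path.home() / ".dhg" / "datasets"
    # Expand a leading "~" so user-relative paths behave as expected.
    return Path(data_root).expanduser()


default_dir = resolve_data_root()
custom_dir = resolve_data_root("/tmp/my_datasets")
```

With DHG installed, the corresponding call per the signature above would be `News20()` for the default location or `News20(data_root=...)` for a custom directory.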