CiteseerBiGraph
- class dhg.data.CiteseerBiGraph(data_root=None)
Bases: dhg.data.base.BaseData
The CiteseerBiGraph dataset is a citation network dataset for the vertex classification task. It is a synthetic bipartite graph dataset generated from a citation network (a single graph) in which documents and the citation links between them are treated as nodes and undirected edges, respectively. For more details, see the paper Cascade-BGNN: Toward Efficient Self-supervised Representation Learning on Large-scale Bipartite Graphs.
The content of the CiteseerBiGraph dataset includes the following:
- num_u_classes: The number of classes in set \(U\): \(6\).
- num_u_vertices: The number of vertices in set \(U\): \(1,237\).
- num_v_vertices: The number of vertices in set \(V\): \(742\).
- num_edges: The number of edges: \(1,665\).
- dim_u_features: The dimension of features in set \(U\): \(3,703\).
- dim_v_features: The dimension of features in set \(V\): \(3,703\).
- u_features: The vertex feature matrix in set \(U\). torch.Tensor with size \((1,237 \times 3,703)\).
- v_features: The vertex feature matrix in set \(V\). torch.Tensor with size \((742 \times 3,703)\).
- edge_list: The edge list. List with length \((1,665 \times 2)\).
- u_labels: The label list in set \(U\). torch.LongTensor with size \((1,237, )\).
- Parameters
data_root (str, optional) – The directory in which the data is stored. If set to None, the dataset is downloaded automatically from the server and saved into the default directory ~/.dhg/datasets/. Defaults to None.
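As a minimal sketch of how the dataset might be loaded and checked against the sizes listed above: the helper names `load_citeseer_bigraph`, `mismatched_stats`, and `EXPECTED_STATS` are hypothetical and not part of dhg. The sketch also assumes that dataset items are read by key (for example `data["num_edges"]`), which this page does not state explicitly.

```python
# Scalar entries documented above for CiteseerBiGraph.
EXPECTED_STATS = {
    "num_u_classes": 6,
    "num_u_vertices": 1237,
    "num_v_vertices": 742,
    "num_edges": 1665,
    "dim_u_features": 3703,
    "dim_v_features": 3703,
}


def mismatched_stats(data, expected=EXPECTED_STATS):
    """Return the names of entries whose value differs from the documented one.

    `data` can be any object that supports `data[name]` lookups
    (assumption: a loaded dataset object behaves this way).
    """
    return [name for name, value in expected.items() if data[name] != value]


def load_citeseer_bigraph(data_root=None):
    """Hypothetical wrapper around the documented constructor.

    With data_root=None the dataset is expected to be downloaded into
    ~/.dhg/datasets/, as described in the Parameters entry above.
    """
    # Imported lazily so mismatched_stats() works without dhg installed.
    from dhg.data import CiteseerBiGraph

    return CiteseerBiGraph(data_root=data_root)


# Example usage (requires dhg and, on first run, network access):
#   data = load_citeseer_bigraph()
#   print(mismatched_stats(data))   # an empty list means all sizes match
#   u_features = data["u_features"]  # assumed key-based access
```

Keeping the size check separate from the loader means it can be exercised on a plain dictionary, without triggering a download.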