BlogCatalog
- class dhg.data.BlogCatalog(data_root=None)
Bases: dhg.data.base.BaseData
The BlogCatalog dataset is a social network dataset for the vertex classification task. It is a network of social relationships among bloggers from the BlogCatalog website. Vertex attributes are built from keywords that users supply as short descriptions of their blogs, and the labels are the topic categories provided by the authors.
Note
L1-normalization of the features is not recommended for this dataset.
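For context on the note, L1-normalization rescales each vertex's feature row so that the absolute values of its entries sum to 1. The sketch below shows the operation in plain Python on nested lists. It is an illustration only, not DHG's own preprocessing code, and the name `l1_normalize_rows` is hypothetical.

```python
def l1_normalize_rows(rows):
    """Scale each row so that the absolute values of its entries sum to 1."""
    normalized = []
    for row in rows:
        total = sum(abs(v) for v in row)
        # Leave all-zero rows unchanged to avoid dividing by zero.
        normalized.append([v / total for v in row] if total else list(row))
    return normalized
```

For example, the row `[1, 1, 2]` has an L1 norm of 4, so it becomes `[0.25, 0.25, 0.5]`.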
The content of the BlogCatalog dataset includes the following:
- num_classes: The number of classes: \(6\).
- num_vertices: The number of vertices: \(5,196\).
- num_edges: The number of edges: \(343,486\).
- dim_features: The dimension of features: \(8,189\).
- features: The vertex feature matrix. torch.Tensor with size \((5,196 \times 8,189)\).
- edge_list: The edge list. List with length \((343,486 \times 2)\).
- labels: The label list. torch.LongTensor with size \((5,196, )\).
- Parameters
data_root (str, optional) – The directory where the data is stored. If set to None, the data is downloaded automatically from the server and saved to the default directory ~/.dhg/datasets/. Defaults to None.
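A minimal usage sketch follows. It assumes the `dhg` package is installed and that the dataset object supports dictionary-style access by the field names listed above, as in DHG's other dataset examples. The helper names `load_blogcatalog` and `summarize` are hypothetical and only serve this illustration.

```python
def load_blogcatalog(data_root=None):
    """Construct the dataset object.

    Requires the `dhg` package. With data_root=None, the data may be
    downloaded on first use, as described in the parameter docs.
    """
    # Imported lazily so that `summarize` below can be used without dhg.
    from dhg.data import BlogCatalog

    return BlogCatalog(data_root=data_root)


def summarize(data):
    """Hypothetical helper: collect the scalar statistics of a dataset.

    `data` can be any object that supports data[key] for these keys.
    """
    keys = ("num_classes", "num_vertices", "num_edges", "dim_features")
    return {key: data[key] for key in keys}
```

With the package installed, `summarize(load_blogcatalog())` should report the counts documented above, and the tensors themselves would be read the same way, for example `data["features"]` and `data["labels"]`.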