{"id":2773,"date":"2025-04-01T07:02:24","date_gmt":"2025-04-01T07:02:24","guid":{"rendered":"https:\/\/mailitics.com\/index.php\/2025\/04\/01\/graph-neural-networks-part-3-how-graphsage-handles-changing-graph-structure\/"},"modified":"2025-04-01T07:02:24","modified_gmt":"2025-04-01T07:02:24","slug":"graph-neural-networks-part-3-how-graphsage-handles-changing-graph-structure","status":"publish","type":"post","link":"https:\/\/mailitics.com\/index.php\/2025\/04\/01\/graph-neural-networks-part-3-how-graphsage-handles-changing-graph-structure\/","title":{"rendered":"Graph Neural Networks Part 3: How GraphSAGE Handles Changing Graph Structure"},"content":{"rendered":"<div>\n<p class=\"wp-block-paragraph\"><strong>In the previous parts of this series, we looked at Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs). Both architectures work well, but they also have limitations. A big one is that for large graphs, calculating the node representations with GCNs and GATs becomes very slow. Another limitation is that GCNs and GATs do not generalize well when the graph structure changes: if nodes are added to the graph, a trained GCN or GAT cannot readily make predictions for them. Luckily, these issues can be solved!<\/strong><\/p>\n<p class=\"wp-block-paragraph\">In this post, I will explain <a href=\"https:\/\/towardsdatascience.com\/tag\/graphsage\/\" title=\"Graphsage\">GraphSAGE<\/a> and how it solves common problems of GCNs and GATs. We will train GraphSAGE and use it for graph predictions to compare performance with GCNs and GATs.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">New to GNNs? 
You can start with\u00a0<a href=\"https:\/\/towardsdatascience.com\/graph-neural-networks-part-1-graph-convolutional-networks-explained-9c6aaa8a406e\/\" target=\"_blank\" rel=\"noreferrer noopener\">post 1 about GCNs<\/a>\u00a0(which also contains the initial setup for running the code samples), and <a href=\"https:\/\/towardsdatascience.com\/graph-neural-networks-part-2-graph-attention-networks-vs-gcns-029efd7a1d92\/\" target=\"_blank\" rel=\"noreferrer noopener\">post 2 about GATs<\/a>.\u00a0<\/p>\n<\/blockquote>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<h2 class=\"wp-block-heading\">Two Key Problems with GCNs and\u00a0GATs<\/h2>\n<p class=\"wp-block-paragraph\">I touched on this briefly in the introduction, but let\u2019s dive a bit deeper. What are the problems with the previous GNN models?<\/p>\n<h3 class=\"wp-block-heading\">Problem 1. They don\u2019t generalize<\/h3>\n<p class=\"wp-block-paragraph\">GCNs and GATs struggle to generalize to unseen graphs. At prediction time, the graph structure needs to be the same as during training. This is known as\u00a0<em>transductive learning<\/em>, where the model trains and makes predictions on the same fixed graph. In effect, the model fits one specific graph topology. In reality, graphs change: nodes and edges are added or removed, and this happens often in real-world scenarios. We want our GNNs to be capable of learning patterns that generalize to unseen nodes, or to entirely new graphs (this is called\u00a0<em>inductive<\/em>\u00a0<em>learning<\/em>).<\/p>\n<h3 class=\"wp-block-heading\">Problem 2. They have scalability issues<\/h3>\n<p class=\"wp-block-paragraph\">Training GCNs and GATs on large-scale graphs is computationally expensive. 
GCNs require repeated neighbor aggregation, and the number of nodes involved grows rapidly with every extra layer, while GATs add (multi-head) attention computations for every edge, which scales poorly as the graph grows.<br \/>In large production recommendation systems, whose graphs contain millions of users and products, full-graph GCNs and GATs become slow and often impractical.<\/p>\n<p class=\"wp-block-paragraph\">Let\u2019s take a look at how GraphSAGE addresses these issues.<\/p>\n<h2 class=\"wp-block-heading\">GraphSAGE (SAmple and aggreGatE)<\/h2>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/arxiv.org\/pdf\/1706.02216\" target=\"_blank\" rel=\"noreferrer noopener\">GraphSAGE<\/a>\u00a0makes training much faster and more scalable. It does this by <em>sampling only a subset of neighbors<\/em>. For very large graphs it is computationally infeasible to process all neighbors of every node, as traditional GCNs do (unless you have limitless time, which none of us has\u2026). Another important step of GraphSAGE is\u00a0<em>combining the features of the sampled neighbors with an aggregation function<\/em>.\u00a0<br \/>We will walk through all the steps of GraphSAGE below.<\/p>\n<h3 class=\"wp-block-heading\">1. Sampling Neighbors<\/h3>\n<p class=\"wp-block-paragraph\">With tabular data, sampling is easy. It\u2019s something you do in every common machine learning project when creating train, test, and validation sets. With graphs, you cannot simply select random nodes, because this can result in disconnected graphs, nodes without neighbors, and so on:<\/p>\n<figure class=\"wp-block-image size-full is-resized\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/disconnected.drawio-3.png?ssl=1\" alt=\"\" class=\"wp-image-600802\" style=\"width:680px;height:auto\"><figcaption class=\"wp-element-caption\">Randomly selecting nodes, but some are disconnected. 
Image by\u00a0author.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">What you\u00a0<em>can<\/em>\u00a0do with graphs is select a random fixed-size subset of neighbors. For example, in a social network, you can sample 3 friends for each user (instead of all friends):<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/neighborselection.drawio-5.png?ssl=1\" alt=\"\" class=\"wp-image-600803\"><figcaption class=\"wp-element-caption\">Randomly selecting three rows in the table, all neighbors selected in the GCN, three neighbors selected in GraphSAGE. Image by\u00a0author.<\/figcaption><\/figure>\n<h3 class=\"wp-block-heading\">2. Aggregate Information<\/h3>\n<p class=\"wp-block-paragraph\">After selecting neighbors in the previous step, GraphSAGE combines their features into a single representation. There are multiple ways to do this (multiple\u00a0<em>aggregation functions<\/em>). The most common ones, which are also the ones explained in the paper, are\u00a0<em>mean aggregation<\/em>,\u00a0<em>LSTM<\/em>, and\u00a0<em>pooling<\/em>.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">With mean aggregation, the average is computed over all sampled neighbors\u2019 features (very simple and often effective). In a formula:<\/p>\n<p class=\"wp-block-paragraph\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1600\/1%2AmvSOnRmQxgY4X-KokooSMw.png?ssl=1\"><\/p>\n<p class=\"wp-block-paragraph\">LSTM aggregation uses an\u00a0<a href=\"https:\/\/www.bioinf.jku.at\/publications\/older\/2604.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">LSTM<\/a>\u00a0(a type of recurrent neural network) to process neighbor features sequentially. 
It can capture more complex relationships and is more expressive than mean aggregation, but because neighbors have no natural order, the paper applies it to a random permutation of the neighbors.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">The third type, pool aggregation, passes each neighbor\u2019s features through a non-linear transformation and then takes the element-wise maximum to extract key features (think about\u00a0<a href=\"https:\/\/paperswithcode.com\/method\/max-pooling\" target=\"_blank\" rel=\"noreferrer noopener\">max-pooling<\/a>\u00a0in a neural network, where you also take the maximum value of some values).<\/p>\n<h3 class=\"wp-block-heading\">3. Update Node Representation<\/h3>\n<p class=\"wp-block-paragraph\">After sampling and aggregation, the node\u00a0<em>combines its previous features with the aggregated neighbor features<\/em>. Nodes learn from their neighbors but also keep their own identity, just like we saw before with GCNs and GATs. This lets information flow across the graph effectively.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">This is the formula for this step:<\/p>\n<p class=\"wp-block-paragraph\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1600\/1%2A8QkSpHp70K4bq1e4EsG1Qg.png?ssl=1\"><\/p>\n<p class=\"wp-block-paragraph\">The aggregation of step 2 is done over the sampled neighbors, and the result is concatenated with the node\u2019s own representation from the previous layer. This vector is multiplied by the weight matrix and passed through a non-linearity (for example ReLU). As a final step, normalization can be applied.<\/p>\n<h3 class=\"wp-block-heading\">4. Repeat for Multiple\u00a0Layers<\/h3>\n<p class=\"wp-block-paragraph\">The first three steps can be repeated multiple times. When this happens, information can flow in from more distant neighbors. 
In the image below you see a node with three neighbors selected in the first layer (direct neighbors), and two neighbors selected in the second layer (neighbors of neighbors).\u00a0<\/p>\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/contributor.insightmediagroup.io\/wp-content\/uploads\/2025\/03\/graphsage.drawio-4.png?ssl=1\" alt=\"\" class=\"wp-image-600804\"><figcaption class=\"wp-element-caption\">Selected node with selected neighbors, three in the first layer, two in the second layer. Note that the selected node is itself a neighbor of each first-layer node, so it can also be sampled when two neighbors are selected in the second step (just a bit harder to visualize). Image by\u00a0author.<\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">To summarize, the key strengths of GraphSAGE are scalability (sampling makes it efficient for massive graphs), flexibility (you can use it for <a href=\"https:\/\/towardsdatascience.com\/tag\/inductive-learning\/\" title=\"Inductive learning\">inductive learning<\/a>, so it works well for predicting on unseen nodes and graphs), generalization (aggregation smooths out noisy features), and reach (multiple layers allow the model to learn from far-away nodes). <\/p>\n<p class=\"wp-block-paragraph\">Cool! And best of all, GraphSAGE is implemented in\u00a0<a href=\"https:\/\/pyg.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">PyG<\/a>, so we can use it easily with PyTorch.<\/p>\n<h2 class=\"wp-block-heading\">Predicting with GraphSAGE<\/h2>\n<p class=\"wp-block-paragraph\">In the previous posts, we implemented an MLP, GCN, and GAT on the <a href=\"https:\/\/paperswithcode.com\/dataset\/cora\" target=\"_blank\" rel=\"noreferrer noopener\">Cora<\/a>\u00a0dataset (CC BY-SA). 
To refresh your memory, Cora is a dataset of scientific publications where you have to predict the subject of each paper, with seven classes in total. This dataset is relatively small, so it might not be the best set for testing GraphSAGE. We will do this anyway, just to be able to compare. Let\u2019s see how well GraphSAGE performs.<\/p>\n<p class=\"wp-block-paragraph\">Parts of the code related to GraphSAGE that I\u2019d like to highlight:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">The\u00a0<code>NeighborLoader<\/code>, which selects the neighbors for each layer:<\/li>\n<\/ul>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">from torch_geometric.loader import NeighborLoader\n\n# 10 neighbors sampled in the first layer, 10 in the second layer\nnum_neighbors = [10, 10]\n\n# sample data from the train set\ntrain_loader = NeighborLoader(\n    data,\n    num_neighbors=num_neighbors,\n    batch_size=batch_size,\n    input_nodes=data.train_mask,\n)<\/code><\/pre>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">The aggregation type is set in the\u00a0<code>SAGEConv<\/code>\u00a0layer. The default is\u00a0<code>mean<\/code>; you can change this to\u00a0<code>max<\/code>\u00a0or\u00a0<code>lstm<\/code>:<\/li>\n<\/ul>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">from torch_geometric.nn import SAGEConv\n\nSAGEConv(in_c, out_c, aggr='mean')<\/code><\/pre>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Another important difference is that GraphSAGE is trained in mini-batches, while GCN and GAT are trained on the full dataset. This touches the essence of GraphSAGE: because neighbor sampling makes it possible to train in mini-batches, we don\u2019t need the full graph anymore. 
GCNs and GATs do need the complete graph for correct feature propagation and calculation of attention scores, which is why we train GCNs and GATs on the full graph.<\/li>\n<li class=\"wp-block-list-item\">The rest of the code is similar to before, except that we have one class in which the different models are instantiated based on the\u00a0<code>model_type<\/code>\u00a0(GCN, GAT, or SAGE). This makes it easy to compare or make small changes.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">This is the complete script. We train for 100 epochs and repeat the experiment 10 times to calculate the average accuracy and standard deviation for each model:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\">import torch\nimport torch.nn.functional as F\nfrom torch_geometric.nn import SAGEConv, GCNConv, GATConv\nfrom torch_geometric.datasets import Planetoid\nfrom torch_geometric.loader import NeighborLoader\n\n# dataset_name can be 'Cora', 'CiteSeer', 'PubMed'\ndataset_name = 'Cora'\nhidden_dim = 64\nnum_layers = 2\nnum_neighbors = [10, 10]\nbatch_size = 128\nnum_epochs = 100\nmodel_types = ['GCN', 'GAT', 'SAGE']\n\ndataset = Planetoid(root='data', name=dataset_name)\ndata = dataset[0]\ndevice = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\ndata = data.to(device)\n\nclass GNN(torch.nn.Module):\n    def __init__(self, in_channels, hidden_channels, out_channels, num_layers, model_type='SAGE', gat_heads=8):\n        super().__init__()\n        self.convs = torch.nn.ModuleList()\n        self.model_type = model_type\n        self.gat_heads = gat_heads\n\n        def get_conv(in_c, out_c, is_final=False):\n            if model_type == 'GCN':\n                return GCNConv(in_c, out_c)\n            elif model_type == 'GAT':\n                heads = 1 if is_final else gat_heads\n                concat = not is_final\n                return GATConv(in_c, out_c, heads=heads, concat=concat)\n            else:\n                
return SAGEConv(in_c, out_c, aggr='mean')\n\n        if model_type == 'GAT':\n            self.convs.append(get_conv(in_channels, hidden_channels))\n            in_dim = hidden_channels * gat_heads\n            for _ in range(num_layers - 2):\n                self.convs.append(get_conv(in_dim, hidden_channels))\n                in_dim = hidden_channels * gat_heads\n            self.convs.append(get_conv(in_dim, out_channels, is_final=True))\n        else:\n            self.convs.append(get_conv(in_channels, hidden_channels))\n            for _ in range(num_layers - 2):\n                self.convs.append(get_conv(hidden_channels, hidden_channels))\n            self.convs.append(get_conv(hidden_channels, out_channels))\n\n    def forward(self, x, edge_index):\n        for conv in self.convs[:-1]:\n            x = F.relu(conv(x, edge_index))\n        x = self.convs[-1](x, edge_index)\n        return x\n\n@torch.no_grad()\ndef test(model):\n    model.eval()\n    out = model(data.x, data.edge_index)\n    pred = out.argmax(dim=1)\n    accs = []\n    for mask in [data.train_mask, data.val_mask, data.test_mask]:\n        accs.append(int((pred[mask] == data.y[mask]).sum()) \/ int(mask.sum()))\n    return accs\n\nresults = {}\n\nfor model_type in model_types:\n    print(f'Training {model_type}')\n    results[model_type] = []\n\n    for i in range(10):\n        model = GNN(dataset.num_features, hidden_dim, dataset.num_classes, num_layers, model_type, gat_heads=8).to(device)\n        optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)\n\n        if model_type == 'SAGE':\n            train_loader = NeighborLoader(\n                data,\n                num_neighbors=num_neighbors,\n                batch_size=batch_size,\n                input_nodes=data.train_mask,\n            )\n\n            def train():\n                model.train()\n                total_loss = 0\n                for batch in train_loader:\n                    batch = 
batch.to(device)\n                    optimizer.zero_grad()\n                    out = model(batch.x, batch.edge_index)\n                    # the first batch.batch_size rows are the seed (training) nodes; the rest are sampled neighbors\n                    seed = batch.batch_size\n                    loss = F.cross_entropy(out[:seed], batch.y[:seed])\n                    loss.backward()\n                    optimizer.step()\n                    total_loss += loss.item()\n                return total_loss \/ len(train_loader)\n\n        else:\n            def train():\n                model.train()\n                optimizer.zero_grad()\n                out = model(data.x, data.edge_index)\n                loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])\n                loss.backward()\n                optimizer.step()\n                return loss.item()\n\n        best_val_acc = 0\n        best_test_acc = 0\n        for epoch in range(1, num_epochs + 1):\n            loss = train()\n            train_acc, val_acc, test_acc = test(model)\n            if val_acc &gt; best_val_acc:\n                best_val_acc = val_acc\n                best_test_acc = test_acc\n            if epoch % 10 == 0:\n                print(f'Epoch {epoch:02d} | Loss: {loss:.4f} | Train: {train_acc:.4f} | Val: {val_acc:.4f} | Test: {test_acc:.4f}')\n\n        results[model_type].append([best_val_acc, best_test_acc])\n\nfor model_name, model_results in results.items():\n    model_results = torch.tensor(model_results)\n    print(f'{model_name} Val Accuracy: {model_results[:, 0].mean():.3f} \u00b1 {model_results[:, 0].std():.3f}')\n    print(f'{model_name} Test Accuracy: {model_results[:, 1].mean():.3f} \u00b1 {model_results[:, 1].std():.3f}')\n<\/code><\/pre>\n<p class=\"wp-block-paragraph\">And here are the results as originally reported. They come from a run in which the SAGE loss was taken over every node in each sampled mini-batch rather than only the seed nodes, so the SAGE numbers are likely optimistic:<\/p>\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-markdown\">GCN Val Accuracy: 0.791 \u00b1 0.007\nGCN Test Accuracy: 0.806 \u00b1 0.006\nGAT Val Accuracy: 0.790 \u00b1 0.007\nGAT Test Accuracy: 0.800 \u00b1 0.004\nSAGE Val Accuracy: 0.899 \u00b1 0.005\nSAGE Test Accuracy: 0.907 \u00b1 
0.004<\/code><\/pre>\n<p class=\"wp-block-paragraph\">In these runs, GraphSAGE scores clearly higher than GAT and GCN, even on this small dataset. I repeated this test for the CiteSeer and PubMed datasets, and GraphSAGE always came out best.\u00a0<\/p>\n<p class=\"wp-block-paragraph\">What I\u2019d like to note here is that GCN is still very useful; it\u2019s one of the most effective baselines (if the graph structure allows it). Also, I didn\u2019t do much hyperparameter tuning, but just went with some standard values (like 8 heads for the GAT multi-head attention). In larger, more complex and noisier graphs, the advantages of GraphSAGE become clearer than in this example. We didn\u2019t do any runtime benchmarking, because for these small graphs GraphSAGE isn\u2019t faster than GCN.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-dotted\">\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n<p class=\"wp-block-paragraph\">GraphSAGE brings us very nice improvements and benefits compared to GATs and GCNs. Inductive learning is possible, so GraphSAGE can handle changing graph structures quite well. And although we didn\u2019t test it in this post, neighbor sampling makes it possible to create feature representations for larger graphs with good performance.\u00a0<\/p>\n<h3 class=\"wp-block-heading\">Related<\/h3>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/towardsdatascience.com\/optimizing-connections-mathematical-optimization-within-graphs-7364e082a984\"><strong>Optimizing Connections: Mathematical Optimization within Graphs<\/strong><\/a><\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/towardsdatascience.com\/graph-neural-networks-part-1-graph-convolutional-networks-explained-9c6aaa8a406e\"><strong>Graph Neural Networks Part 1. 
Graph Convolutional Networks Explained<\/strong><\/a><\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/towardsdatascience.com\/graph-neural-networks-part-2-graph-attention-networks-vs-gcns-029efd7a1d92\"><strong>Graph Neural Networks Part 2. Graph Attention Networks vs. GCNs<\/strong><\/a><\/p>\n<\/blockquote>\n<p>The post <a href=\"https:\/\/towardsdatascience.com\/graph-neural-networks-part-3-how-graphsage-handles-changing-graph-structure\/\">Graph Neural Networks Part 3: How GraphSAGE Handles Changing Graph Structure<\/a> appeared first on <a href=\"https:\/\/towardsdatascience.com\/\">Towards Data Science<\/a>.<\/p>\n<\/div>\n<p> \t<BR><br \/>\n <BR><\/BR><br \/>\n    Hennie de Harder<br \/>\n \t<BR><br \/>\n<BR><\/BR><br \/>\n<a href=\"https:\/\/towardsdatascience.com\/graph-neural-networks-part-3-how-graphsage-handles-changing-graph-structure\/\">Go to original source<\/a><br \/>\n \t<BR><br \/>\n <BR><\/BR><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Graph Neural Networks Part 3: How GraphSAGE Handles Changing Graph Structure In the previous parts of this series, we looked at Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs). Both architectures work fine, but they also have some limitations! 
A big one is that for large graphs, calculating the node representations with GCNs and [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[62,69,240,2226,2227,2228,2229],"tags":[2231,2230,339],"class_list":["post-2773","post","type-post","status-publish","format-standard","hentry","category-aimldsaimlds","category-artificial-intelligence","category-editors-pick","category-graphsage","category-inductive-learning","category-large-graphs","category-node-representation","tag-gats","tag-gcns","tag-graph"],"_links":{"self":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2773"}],"collection":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/comments?post=2773"}],"version-history":[{"count":0,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/posts\/2773\/revisions"}],"wp:attachment":[{"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/media?parent=2773"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/categories?post=2773"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mailitics.com\/index.php\/wp-json\/wp\/v2\/tags?post=2773"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}