## Abstract

Graph representations are traditionally used to represent protein structures in sequence design protocols where the folding pattern is known. This infrequently extends to machine learning projects: existing graph convolution algorithms have shortcomings when representing protein environments. One reason for this is the lack of emphasis on edge attributes during message-passing operations. Another reason is the traditionally shallow nature of graph neural network architectures. Here we introduce an improved message-passing operation that is better equipped to model local kinematics problems such as protein design. Our approach, XENet, pays special attention to both incoming and outgoing edge attributes.

We compare XENet against existing graph convolutions in an attempt to decrease rotamer sample counts in Rosetta’s rotamer substitution protocol. This use case is motivating because it allows larger protein design problems to fit onto near-term quantum computers. XENet outperformed competing models while also displaying a greater tolerance for deeper architectures. We found that XENet was able to decrease rotamer counts by 40% without loss in quality. This decreased the problem size of our use case by more than a factor of 3.

**Author summary** Graph data structures are ubiquitous in the field of protein design and are at the core of the recent advances in artificial intelligence brought forth by graph neural networks (GNNs). GNNs have led to some impressive results in modeling protein interactions, but are not as common as other tensor representations.

Most GNN architectures tend to put little to no emphasis on the information stored on edges; however, protein modeling tools often use edges to represent vital geometric relationships about residue pair interactions. In this paper, we show that a more advanced processing of edge attributes can lead to considerable benefits when modeling chemical data.

We introduce XENet, a new member of the GNN family that is shown to have improved ability to model protein residue environments based on chemical and geometric data. We use XENet to intelligently simplify the optimization problem that is solved when designing proteins. This task is important to us and others because it allows larger proteins to be designed on near-term quantum computers. We show that XENet is able to train on our protein modeling data better than existing methods, successfully resulting in a dramatic decrease in protein design sample space with no loss in quality.

## Introduction

Protein design involves astronomically large search problems beyond the capabilities of even the largest supercomputers. [1] Current computational methods make use of stochastic search algorithms such as simulated annealing to handle this large space. [2] This task traditionally involves assuming a static protein backbone and representing all candidate sidechain conformations and identities as discrete possibilities called “rotamers”. [3, 4] A single sequence position on the protein can have hundreds of candidate rotamers when spanning all twenty native amino acids.

Quantum computing offers a new alternative for solving these complex tasks to power the development of new protein-based therapeutics and enzymes of industrial interest. [5] In previous work, we demonstrated how the protein design problem can be expressed as a combinatorial optimization problem and solved using quantum annealing hardware and hybrid quantum-classical solvers. [6] Critically, we were able to show the system’s applicability to real-world protein design problems without reducing the complexity of the problem.

This method used the Rosetta software suite to model these backbone-dependent rotamers and to calculate the one- and two-body interactions between them [7–9]. Our goal was to find the set of rotamers that minimizes the protein’s computed energy, measured in Rosetta Energy Units (REU). Rosetta does this using simulated annealing, in a process called “packing” and “rotamer substitution” [4, 10].

Mapping large protein design problems directly to quantum hardware was limited by a number of factors including noise and the number of qubits available. Even using a hybrid solver proved impractical for large problems as noise and time constraints effectively placed an upper barrier to the size of problems that could be solved. Additionally, we have evidence that the modeling of some atomic interactions, like hydrogen bonds, would be improved with a finer granularity of rotamer sampling, suggesting that our problem has reason to grow even larger [11].

Our goal for this project was to use machine learning to adaptively decrease sample space for arbitrary protein design problems by eliminating rotamers from consideration. Scientists are having rapidly increasing success using artificial neural networks to design proteins using a variety of representations [12, 13]. We have recently seen success representing proteins by passing contact maps into image-inspired 2D convolutions [14, 15], 3D convolutions on voxelized representations [16, 17], and even language models on protein sequences [18–20]. The representation that interests us the most is the graph-based representation found in graph neural networks [21–23].

Graphs are intuitive representations for protein modeling cases in which the backbone structure is already established, as it is in protein design. In fact, traditional protein modeling tools such as Rosetta use graphs internally to model interactions during their own protocols [7, 24–26]. These residue-centric graphs represent each sequence position as a node, with edges connecting positions that are close in 3D space. Node attributes generally encode the residue’s backbone geometry and possibly some representation of its sidechain identity. Edge attributes are used to model the interactions and geometry between residue positions.
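As a rough illustration of the residue-centric graph described above, the following sketch links sequence positions whose C-alpha atoms are close in 3D space. The function name `build_residue_graph`, the 10 Å cutoff, and the toy coordinates are assumptions made for this example; they are not the settings used by Rosetta or by this work.

```python
import numpy as np

def build_residue_graph(ca_coords, cutoff=10.0):
    """Connect residues whose C-alpha atoms lie within `cutoff` angstroms.

    The 10 A cutoff is a hypothetical choice for illustration only.
    """
    coords = np.asarray(ca_coords, dtype=float)
    # Pairwise Euclidean distances between all residue positions.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adjacency = (dist <= cutoff).astype(int)
    np.fill_diagonal(adjacency, 0)  # no self-loops in this sketch
    return adjacency, dist

# Four toy residues on a line; the last one is far from the others.
toy_coords = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0], [30.0, 0.0, 0.0]]
adjacency, dist = build_residue_graph(toy_coords)
print(adjacency)
```

The distance matrix is returned alongside the adjacency matrix because pairwise geometry of this kind is the sort of quantity that would typically be stored as an edge attribute.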

Graph neural networks (GNNs) are a class of machine learning models designed to process graph-structured data. While the seminal research on GNNs dates back to the works of Sperduti *et al*. [27], Gori *et al*. [28], and Scarselli *et al*. [29], recent research efforts have led to a rapid growth of the field and have achieved state-of-the-art results on a large variety of applications, ranging from social networks [30–32], to chemistry [33, 34], biology [21, 35, 36], and physics [37].

The growth of the field has led to the development of many diverse GNN architectures, notably including the works in references [38–43]. Of particular interest to this work are those models that can be expressed as message-passing architectures [44]. In particular, message-passing GNNs act on the node attributes of a graph according to the following general scheme:

$$\mathbf{x}_i' = \gamma\left(\mathbf{x}_i,\ \square_{j \in \mathcal{N}(i)}\ \phi\left(\mathbf{x}_i, \mathbf{x}_j, \mathbf{e}_{ji}\right)\right) \tag{1}$$

where *ϕ* is a *message* function that depends on the graph’s node and edge attributes (resp. **X** and **E**), □ is any permutation-invariant operation that aggregates messages coming from the neighborhood of *i*, and *γ* is an *update* function (see the Notation section for the remaining symbols). Intuitively, message-passing GNNs transform the attributes of the graph by exchanging information between neighboring nodes.
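The general scheme can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a sum as the permutation-invariant aggregation and simple concatenations for the message and update functions; `message_passing_step`, `phi`, `gamma`, and the toy graph are all hypothetical names and values chosen for the example.

```python
import numpy as np

def message_passing_step(X, E, A, phi, gamma, msg_dim):
    """One message-passing step using a sum as the aggregation."""
    updated = []
    for i in range(X.shape[0]):
        agg = np.zeros(msg_dim)
        for j in np.nonzero(A[i])[0]:
            # Message from neighbor j to node i, using the edge (j, i).
            agg += phi(X[i], X[j], E[j, i])
        updated.append(gamma(X[i], agg))
    return np.stack(updated)

def phi(x_i, x_j, e_ji):
    # Toy message: the neighbor's attributes joined with the edge attributes.
    return np.concatenate([x_j, e_ji])

def gamma(x_i, agg):
    # Toy update: keep the old attributes next to the aggregated messages.
    return np.concatenate([x_i, agg])

# Path graph 0 - 1 - 2 with asymmetric edge attributes.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
E = np.zeros((3, 3, 1))
E[0, 1, 0], E[1, 0, 0] = 0.5, -0.5
E[1, 2, 0], E[2, 1, 0] = 2.0, -2.0

X_new = message_passing_step(X, E, A, phi, gamma, msg_dim=3)
print(X_new)
```

In a real GNN layer, the message and update functions would be learnable (for example, small neural networks) rather than fixed concatenations.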

While the definition of Eq. (1) allows the message function to depend on the edge attribute between a node and its neighbor, the majority of GNN architectures are designed for non-attributed edges. Among those GNNs that are designed to process edge attributes, we mention the Edge-Conditioned Convolutions (ECCs) introduced by Simonovsky and Komodakis [45]. ECCs make use of an auxiliary model called a *filter-generating network* (FGN) that takes edge attributes as input and outputs the parameters of *ϕ* in Eq. (1), replacing what would conventionally be fixed learnable weights. ECCs can bring significant advantages when processing graphs for which edge attributes are important and have been used to process molecular graphs [46]. However, the FGN can be difficult to train due to the absence of a strong supervision signal (which is particularly difficult to achieve when stacking many layers) and ECCs are mostly effective in processing edge attributes with a one-hot representation.

In recent years, other types of GNNs have been proposed that process edge attributes directly in the message function, without relying on a FGN. These usually concatenate [47] or sum [48] the edge attributes to the node attributes of the neighbors. In particular, here we consider the work of Xie *et al*. [47], based on concatenation, which we denote as *CrystalConv* in the following.

We note, however, that all of the methods mentioned above suffer from two key issues. First, none of them are designed to take into account the case of symmetric directed graphs with asymmetric edge attributes (*i.e*., graphs for which the existence of edge (*i, j*) implies the existence of edge (*j, i*) and *vice versa*, but the corresponding attributes can differ). This is particularly relevant for our work due to the geometric nature of our edge attributes: our edges themselves have no directionality but nearly every edge feature has some degree of asymmetry. Second, most existing methods are not designed to update edge attributes, which are considered as static inputs throughout the network. The updating of edge attributes is not a novel idea *per se*, since it was proposed both in the Graph Network model by Battaglia *et al*. [49] and in the Typed Graph Network of Prates *et al*. [50] (where both are works that attempt to unify GNNs in a similar spirit to the message passing framework), but to the best of our knowledge it is seldom applied in practice.

Here we propose XENet, a GNN model that addresses both concerns while also avoiding the computational issues introduced by FGNs. XENet is a message-passing GNN that simultaneously accounts for both the incoming and outgoing neighbors of each node, such that a node’s representation is based on the messages it receives as well as those it sends. We demonstrate XENet’s advantage over ECC and CrystalConv by testing their abilities to eliminate rotamer candidates in real-world protein design problems.

## Materials and methods

### Notation

Let a graph be a tuple 𝒢 = (𝒱, ℰ), with node set 𝒱 = {1, …, *N*} and edge set ℰ ⊆ 𝒱 × 𝒱 s.t. (*i*, *j*) ∈ ℰ is a directed edge from node *i* to node *j*. Additionally, let **x**_{i} ∈ ℝ^{F} indicate a vector attribute associated with node *i* and let **e**_{ij} ∈ ℝ^{S} indicate a vector attribute associated with edge (*i*, *j*). We indicate the neighborhood of a node with 𝒩(*i*) = {*j* | (*i*, *j*) ∈ ℰ}. Note that in our case we consider symmetric directed graphs, so that the incoming and outgoing neighbors of a node coincide.

To make notation more compact, in the following we denote with **X** ∈ ℝ^{N×F} the matrix of node attributes, with **E** ∈ ℝ^{N×N×S} the matrix of edge attributes (we assume the entries of this matrix to be zero if the corresponding edge does not exist), and with **A** ∈ {0,1}^{N×N} the binary adjacency matrix of the graph.
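To make the notation concrete, the sketch below builds toy versions of the three tensors for a symmetric directed graph whose edge attributes differ by direction. The sizes and random values are placeholders chosen for illustration (the model in this paper uses *N* = 30, *F* = 46, and *S* = 28), and `neighborhood` is a hypothetical helper.

```python
import numpy as np

N, F, S = 4, 3, 2  # toy sizes for illustration only
rng = np.random.default_rng(0)

# Binary adjacency matrix: edge (i, j) exists if and only if (j, i) exists.
A = np.zeros((N, N), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1

# Node attributes, one row per node.
X = rng.normal(size=(N, F))

# Edge attributes, zeroed wherever no edge exists. E[i, j] and E[j, i]
# are drawn independently, so the attributes are asymmetric.
E = rng.normal(size=(N, N, S)) * A[..., None]

def neighborhood(A, i):
    """Indices of the neighbors of node i."""
    return np.nonzero(A[i])[0].tolist()

print(neighborhood(A, 1))
```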

### XENet

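Our architecture, which we refer to as XENet (due to its ability to convolve over both **X** and **E** tensors), is described by the following equations:

$$\mathbf{s}_{ij} = \varphi^{(s)}\left(\mathbf{x}_i \,\|\, \mathbf{x}_j \,\|\, \mathbf{e}_{ij} \,\|\, \mathbf{e}_{ji}\right) \tag{2}$$

$$\mathbf{s}_i^{(out)} = \sum_{j \in \mathcal{N}(i)} a^{(out)}\left(\mathbf{s}_{ij}\right)\mathbf{s}_{ij} \tag{3}$$

$$\mathbf{s}_i^{(in)} = \sum_{j \in \mathcal{N}(i)} a^{(in)}\left(\mathbf{s}_{ji}\right)\mathbf{s}_{ji} \tag{4}$$

$$\mathbf{x}_i' = \varphi^{(n)}\left(\mathbf{x}_i \,\|\, \mathbf{s}_i^{(out)} \,\|\, \mathbf{s}_i^{(in)}\right) \tag{5}$$

$$\mathbf{e}_{ij}' = \varphi^{(e)}\left(\mathbf{s}_{ij}\right) \tag{6}$$

where ‖ denotes concatenation, *φ*^{(s)}, *φ*^{(n)}, and *φ*^{(e)} are multi-layer perceptrons with Parametric Rectified Linear Unit activations [51], and *a*^{(out)} and *a*^{(in)} are two dense layers with sigmoid activations and a single scalar output.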

The core of XENet lies in the computation and aggregation of the *feature stacks* **s**_{ij} in Eqs. (2)–(4). These are obtained by concatenating the node and edge attributes associated with the incoming and outgoing messages (Eq. (2)), so that the multi-layer perceptron *φ*^{(s)} learns to process the two directions separately. The feature stacks are also aggregated separately in the two directions of the flow, using self-attention [52] to compute a weighted sum (Eqs. (3)–(4)). The separate representations are concatenated and used to update the node attributes of the graph (Eq. (5)). Finally, some additional processing of the feature stacks through *φ*^{(e)} lets us compute new edge attributes that are dependent on the message exchange between nodes (Eq. (6)).
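The data flow described above can be sketched as a single NumPy forward pass. This is an illustrative simplification, not the reference implementation: each multi-layer perceptron is replaced by one dense layer, the PReLU slope is fixed rather than learned, and all names (`xenet_layer`, `init_params`, the parameter keys) and sizes are hypothetical.

```python
import numpy as np

def prelu(z, alpha=0.25):
    # PReLU with a fixed slope; in a trained model alpha would be learned.
    return np.where(z > 0, z, alpha * z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(F, S, K, F_out, S_out, seed=0):
    rng = np.random.default_rng(seed)
    w = lambda *shape: rng.normal(scale=0.1, size=shape)
    return {
        "W_s": w(2 * F + 2 * S, K), "b_s": np.zeros(K),
        "w_out": w(K, 1), "w_in": w(K, 1),
        "W_n": w(F + 2 * K, F_out), "b_n": np.zeros(F_out),
        "W_e": w(K, S_out), "b_e": np.zeros(S_out),
    }

def xenet_layer(X, E, A, p):
    N, F = X.shape
    mask = A[..., None].astype(float)              # 1 where edge (i, j) exists
    x_i = np.broadcast_to(X[:, None, :], (N, N, F))
    x_j = np.broadcast_to(X[None, :, :], (N, N, F))
    e_ji = np.transpose(E, (1, 0, 2))              # attributes of the reverse edge
    # Feature stacks: both node attributes plus both edge directions.
    stacks = prelu(np.concatenate([x_i, x_j, E, e_ji], axis=-1) @ p["W_s"] + p["b_s"])
    # Scalar attention weights, computed separately for each direction.
    a_out = sigmoid(stacks @ p["w_out"])
    a_in = sigmoid(stacks @ p["w_in"])
    s_out = np.sum(mask * a_out * stacks, axis=1)  # weighted sum over s_ij
    s_in = np.sum(mask * a_in * stacks, axis=0)    # weighted sum over s_ji
    # Node update from the two directional summaries.
    X_new = prelu(np.concatenate([X, s_out, s_in], axis=-1) @ p["W_n"] + p["b_n"])
    # Edge update from the feature stacks (kept only on existing edges).
    E_new = prelu(stacks @ p["W_e"] + p["b_e"]) * mask
    return X_new, E_new

# Toy graph: a 4-node path with asymmetric edge attributes.
rng = np.random.default_rng(1)
A = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1
X = rng.normal(size=(4, 3))
E = rng.normal(size=(4, 4, 2)) * A[..., None]
X_new, E_new = xenet_layer(X, E, A, init_params(F=3, S=2, K=5, F_out=6, S_out=2))
print(X_new.shape, E_new.shape)
```

Because the stack for (*i*, *j*) orders its inputs differently from the stack for (*j*, *i*), the updated attributes of the two directions generally differ, which is the asymmetry the layer is designed to preserve.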

### Generating FixbbGCN Training Data

Here we prepare to apply XENet to a specific protein design problem, as described later in the paper. Our goal is to create a GNN that can analyze an intermediate protein state of FastDesign and predict which rotamers are likely to be sampled in the next round of rotamer substitution. We call this trained network “FixbbGCN”.

We used an arbitrary subset of structures from the Top8000 dataset for training [53], which ensures that no two protein structures have high similarity. Our training set used 967 structures (total of 229,776 residue positions) and our validation set used 239 structures (57,584 residue positions). The number of structures we used simply depended on how much CPU time we were willing to commit for generating data.

We ran 5 repeats of the MonomerDesign2019 variant of Rosetta’s FastDesign [54] protocol on each structure but only collected training data for the final 4 repeats. These 4 repeats account for 16 of the 20 rounds of rotamer substitution, though for this project we only use the data from 4 of the 16 rounds due to score function ramping [54]. We set Rosetta to generate a larger number of more finely-discretized rotamers by passing the ‘-ex1 -ex2’ commandline flags and used Rosetta’s REF2015 energy function [9]. We therefore ended up with 919,104 training set elements (229,776 residues × 4 rounds per residue) and 230,336 validation elements.

For this project, rotamers from the 20 amino acids were binned into 54 categories. Alanine, Proline, and Glycine each had their own bin due to their lack of meaningful *χ*1 attributes. The remaining 17 canonical amino acids had three bins each, which correspond to the three *χ*1 wells.
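The bin count can be checked with a short sketch. The bin ordering, the three-letter residue codes, and the 120° well boundaries used by `chi1_well` are assumptions made for this example; the indexing used by the published model may differ.

```python
# Amino acids that receive a single bin (no meaningful chi1 attribute).
ONE_BIN = ["ALA", "GLY", "PRO"]
# The remaining 17 canonical amino acids receive one bin per chi1 well.
THREE_BIN = ["ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "HIS", "ILE", "LEU",
             "LYS", "MET", "PHE", "SER", "THR", "TRP", "TYR", "VAL"]

def chi1_well(chi1_deg):
    """Map a chi1 angle in degrees to one of three 120-degree wells.

    The boundaries at 0, 120, and 240 degrees are a hypothetical convention.
    """
    return min(int((chi1_deg % 360.0) // 120.0), 2)

# Hypothetical ordering: single-bin residues first, then (residue, well) pairs.
BIN_INDEX = {(aa, None): i for i, aa in enumerate(ONE_BIN)}
for aa in THREE_BIN:
    for well in range(3):
        BIN_INDEX[(aa, well)] = len(BIN_INDEX)

def rotamer_bin(aa, chi1_deg=None):
    """Return the bin index for a rotamer of residue type `aa`."""
    if aa in ONE_BIN:
        return BIN_INDEX[(aa, None)]
    return BIN_INDEX[(aa, chi1_well(chi1_deg))]

print(len(BIN_INDEX))  # 3 + 17 * 3 bins
```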

For each round of rotamer substitution, we tracked the fraction of time that each rotamer was the representative state for its residue position. At the end of the run, any rotamer bin that held the representative state for more than 0.1% of the run was classified as a 1. All other rotamer bins were classified as a 0. Note that this resulted in a multi-label classification problem where every sample was associated with one or more classes. We also ignored data from the fraction of the simulated annealing trajectories where the simulated temperature was above 3 Rosetta Temperature Units (3 REU is intended to correspond with 3 kcal/mol).
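The labeling rule can be sketched as follows, assuming a trajectory is available as a list of `(temperature, bin_index)` pairs with one entry per annealing step. This data layout and the function name `label_position` are hypothetical; they are not the format produced by Rosetta.

```python
from collections import Counter

def label_position(trajectory, n_bins=54, min_fraction=0.001, max_temp=3.0):
    """Build a multi-label target vector for one residue position.

    trajectory: list of (temperature, bin_index) pairs, one per step
    (a hypothetical layout used only for this sketch).
    """
    # Ignore the hot part of the simulated annealing trajectory.
    kept = [bin_index for temp, bin_index in trajectory if temp <= max_temp]
    counts = Counter(kept)
    total = len(kept)
    # A bin is labeled 1 if it was the representative state for more than
    # min_fraction (0.1%) of the retained steps.
    return [1 if total and counts[b] / total > min_fraction else 0
            for b in range(n_bins)]

toy_trajectory = [(10.0, 5)] * 100 + [(2.0, 7)] * 1999 + [(1.0, 9)] * 1
labels = label_position(toy_trajectory)
print(sum(labels))
```

In this toy trajectory, bin 5 is visited only while the temperature is above the limit and bin 9 falls below the 0.1% threshold, so only bin 7 is labeled positive.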

### FixbbGCN Architecture

We refer to this family of networks as FixbbGCN, as the Rosetta rotamer substitution protocol is sometimes called “fixbb”. FixbbGCN is schematically represented in Fig 1. The model has three input tensors for **X**, **A**, and **E**. The maximum number of nodes per graph representation is *N* = 30, the number of attributes per node is *F* = 46, and the number of attributes per edge is *S* = 28. The output of the model is a 54-dimensional vector which holds one value for each of the rotamer bins described in the “Generating FixbbGCN Training Data” section.

For all models, the **X** and **E** tensors are first fed to dense layers. These fully-connected layers only process one node/edge at a time, so that no information flows between nodes or edges. We then apply one or more steps of message passing, using either XENet, CrystalConv, or ECC layers. We used the Spektral package’s implementation of the latter two layers. [55]

Fig 1 shows two rounds of message passing but we tested all models with one, two, and three layers (some XENet models were tested up to five layers, as reported in the SI). We note that the output tensor **E** from the final round of XENet is never used by a future layer. The subset of parameters used to build this final **E** is therefore omitted when we tally trainable parameters.

We set FixbbGCN up as a single-node classification problem as opposed to a graph classification problem. Thus, after the message-passing stage, we focus on the unique node that represents the residue of interest being evaluated. We concatenate the output from the final message-passing layer with the original input **X** tensor in an effort to compensate for the over-smoothing effect of message passing. We then crop the **X** tensor to only include the node that represents the protein residue of interest. FixbbGCN finishes off by running that single node’s data through two more fully-connected layers.
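The concatenate, crop, and classify steps can be sketched in NumPy as below. The hidden width of 64, the choice of node 0 as the residue of interest, and the random weights are assumptions for illustration; only the 100-channel and 54-channel dense layers follow the sizes stated in this paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def classification_head(X_in, H, focus, W1, b1, W2, b2):
    """Single-node head applied after the message-passing layers.

    X_in:  original node attributes, shape (N, F)
    H:     output of the final message-passing layer, shape (N, F_h)
    focus: index of the node for the residue of interest (hypothetical)
    """
    skip = np.concatenate([H, X_in], axis=-1)  # reuse the raw input attributes
    node = skip[focus]                         # crop to the single node
    hidden = relu(node @ W1 + b1)              # penultimate dense layer
    return sigmoid(hidden @ W2 + b2)           # one value per rotamer bin

rng = np.random.default_rng(0)
N, F, F_h = 30, 46, 64                         # F_h = 64 is a placeholder width
X_in = rng.normal(size=(N, F))
H = rng.normal(size=(N, F_h))
W1, b1 = rng.normal(scale=0.1, size=(F_h + F, 100)), np.zeros(100)
W2, b2 = rng.normal(scale=0.1, size=(100, 54)), np.zeros(54)

scores = classification_head(X_in, H, focus=0, W1=W1, b1=b1, W2=W2, b2=b2)
print(scores.shape)
```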

All dense and message-passing layers have ReLU activation functions except for the final dense layer which has a sigmoid activation.

#### Hidden Layer Sizes

We benchmarked two XENet candidates as outlined in Table 1. XENet (s) is sized to have the same hidden layer size as the ECC models. XENet (p) is sized to have the same number of trainable parameters as the ECC models. We tuned these parameters by changing F_{h} and S_{h}, which are the number of channels for the hidden X and E layers, respectively, before the cropping layer. The penultimate dense layer always has 100 channels and the final layer always has 54 channels.

Likewise, we benchmarked two CrystalConv models using the same normalization techniques. The parameter normalization was not perfect but we got as close as possible without varying hyperparameters between depths of the same type.

Each XENet layer always used two internal stacking layers with S_{h} channels each. In other words, the *φ*^{(s)} multi-layer perceptrons always had a depth of two.

#### Node and Edge Attributes

Our input data had 46 node attributes and 28 edge attributes, all of which are listed in the Supporting Information. Most of these attributes are direct physical characteristics of residues and physical relationships of residue pairs. We also included more advanced analytics in the form of Rosetta score terms.

Many of these attributes require access to the pyrosetta package to compute. [56] These include the Rosetta score terms, hydrogen bond identification, and the residue pair “jump” measurements. A Rosetta “jump” describes the six-dimensional rigid body relationship between the coordinate frames of two protein residues based on their backbone atoms.

#### MentenGCN Package

We have created a public Python package in an effort to make protein processing with GNNs more portable and easier to share. MentenGCN [57] has a library of tensor decorators that were used for this project to generate the **X**, **A**, and **E** input tensors directly from Rosetta’s protein representation. The configuration class for the GNN used in this paper is available within the MentenGCN package under the name `Maguire_Grattarola_2021`. Please refer to the Supporting Information section for more detail on how to access this feature.

### Training and Evaluating FixbbGCN Models

Each model configuration was trained between 6 and 12 times, loosely depending on the amount of resources required to train each model. We show later that the performance of a given architecture generally has narrow variance so we did not see the need to expand this sampling.

Each model was trained using Keras’s implementation of the Adam optimizer with a starting learning rate of 0.001 and the binary crossentropy loss function [58, 59]. The learning rate was reduced by a factor of 10 whenever the validation loss plateaued for 2 consecutive epochs (`min_delta=0.001`). Training was halted whenever the validation loss plateaued for 5 consecutive epochs. We evaluated all models with binary crossentropy and Receiver Operating Characteristic (ROC) area-under-curve (AUC) on our validation set.
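The schedule above can be illustrated with a simplified, framework-free simulation. In practice this behavior would most likely come from Keras callbacks such as `ReduceLROnPlateau` and `EarlyStopping`, but that is an assumption, and the sketch below does not reproduce their exact bookkeeping. The function name, the `stop_min_delta` default of 0, and the toy loss values are hypothetical.

```python
def simulate_schedule(val_losses, lr=1e-3, factor=0.1, lr_patience=2,
                      stop_patience=5, min_delta=1e-3, stop_min_delta=0.0):
    """Return (per-epoch learning rates, index of the last epoch run)."""
    best_for_lr = best_for_stop = float("inf")
    lr_wait = stop_wait = 0
    lrs = []
    for epoch, loss in enumerate(val_losses):
        # Learning-rate tracker: the loss must improve by more than min_delta.
        if loss < best_for_lr - min_delta:
            best_for_lr, lr_wait = loss, 0
        else:
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr *= factor      # reduce the learning rate by a factor of 10
                lr_wait = 0
        # Early-stopping tracker (its tolerance is an assumption).
        if loss < best_for_stop - stop_min_delta:
            best_for_stop, stop_wait = loss, 0
        else:
            stop_wait += 1
        lrs.append(lr)
        if stop_wait >= stop_patience:
            return lrs, epoch
    return lrs, len(val_losses) - 1

toy_losses = [0.5, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]
lrs, last_epoch = simulate_schedule(toy_losses)
print(last_epoch, lrs[-1])
```

With this toy loss curve the learning rate is cut twice before training halts, which mirrors the intended order of events: the optimizer gets a chance to recover at a smaller step size before the run is abandoned.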

### Benchmarking FixbbGCN Implementation On Classical Computer

As we will show in the Results section, the best model observed was XENet (p) with 3 layers. We benchmarked the applicability of this model by using it alongside Rosetta’s packing protocol on six backbones of various sizes. For each backbone, we ran each residue position through our model and compared the 54 final values against a tuneable cutoff. Rotamers were eliminated if the final value for their respective bin fell below the cutoff. We performed this benchmark with a range of cutoffs between 0 and 1. We also included a cutoff of −1.0 as a control (so that no rotamers were eliminated, since the sigmoid activation has a minimum of 0). The larger the cutoff, the more aggressively rotamers were eliminated. We ran each cutoff on each structure 10 times and tracked the final Rosetta score in units of Rosetta Energy Units (REU) where more negative is better.
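The elimination rule can be sketched as a simple filter. The function name `prune_rotamers`, the four-bin toy scores, and the rotamer-to-bin assignments are made up for this example; in the real workflow there are 54 bins and the scores come from the trained model.

```python
def prune_rotamers(rotamer_bins, bin_scores, cutoff):
    """Return the indices of rotamers that survive pruning.

    rotamer_bins: bin index of each candidate rotamer at one position
    bin_scores:   model output for that position, one value per bin
    A rotamer is eliminated if the score of its bin falls below the cutoff.
    """
    return [idx for idx, b in enumerate(rotamer_bins) if bin_scores[b] >= cutoff]

# Toy example with 4 bins instead of 54.
bin_scores = [0.90, 0.05, 0.40, 0.00]
rotamer_bins = [0, 0, 1, 2, 2, 3]

control = prune_rotamers(rotamer_bins, bin_scores, cutoff=-1.0)
pruned = prune_rotamers(rotamer_bins, bin_scores, cutoff=0.3)
print(len(control), len(pruned))
```

Because sigmoid outputs are never negative, the control cutoff of −1.0 keeps every rotamer, while larger cutoffs remove progressively more of them.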

The Protein Data Bank codes for the six backbones used for this benchmark are 1SFX, 1ECO, 1D4O, 1W2C, 1O4S, and 1PJ5 in order of increasing size. All six of these structures are also from the Top8000 dataset [53] so they are expected to have low homology with the training and validation data used to train the model. Staying consistent with the training data collection, Rosetta built rotamers with the “-ex1 -ex2” commandline flags and used Rosetta’s REF2015 score function [9].

### Benchmarking FixbbGCN Implementation On Quantum Computer

This quantum benchmark used all of the same Rosetta parameters and FixbbGCN cutoffs as the classical benchmark. We could not fit the previous test cases on the quantum machine so we used a subset of the smallest problem (protein data bank code: 1SFX). We used Rosetta’s LayerSelector tool to design the 10 residues in the core of the protein. [60] All other residue positions were held immutable, decreasing our maximum rotamer count from 63183 to 5686.

Our quantum rotamer sampling protocol was identical to that described in Mulligan *et al*. [6] Like the classical benchmark, we ran 10 annealing trajectories for each FixbbGCN cutoff and reported the mean and standard deviation across those 10 samples. We also measured Random Access Memory (RAM) usage for each problem size. The RAM usage is expected to scale quadratically with rotamer count due to the need to calculate all residue pair energies between neighboring sequence positions.
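The quadratic scaling argument can be illustrated by counting the pairwise energies that must be stored. The rotamer counts and neighbor pairs below are invented for the example, and `pair_energy_count` is a hypothetical helper; real memory usage also includes overheads that this count ignores.

```python
def pair_energy_count(rotamers_per_position, neighbor_pairs):
    """Number of two-body energies stored for neighboring positions."""
    return sum(rotamers_per_position[p] * rotamers_per_position[q]
               for p, q in neighbor_pairs)

# Toy problem: three positions where 0-1 and 1-2 are neighbors.
neighbor_pairs = [(0, 1), (1, 2)]
full = [100, 200, 300]
reduced = [60, 120, 180]  # every position keeps 60% of its rotamers

ratio = pair_energy_count(reduced, neighbor_pairs) / pair_energy_count(full, neighbor_pairs)
print(ratio)
```

Keeping a fraction *f* of the rotamers at every position leaves a fraction *f*² of the pair energies, so a uniform reduction to 60% would leave about 36% under this idealized model. Measured values can differ because pruning is not uniform across positions.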

## Results and Discussion

### FixbbGCN Model Comparisons

Our goal for this test was to find the graph convolution that would best represent our protein modeling data. XENet is our attempt to engineer a new GNN layer that makes further use of the edge tensors, including updating their features as the result of the convolution. As the baseline model for this experiment we considered ECC, since it is one of the first and most widely used GNNs designed to process edge attributes, and we compare it against different configurations of CrystalConv and XENet to ensure a fair comparison. XENet (s) and CrystalConv (s) are normalized by the channel depth of each hidden layer. XENet (p) and CrystalConv (p) are normalized by the trainable parameter count.

The models were tasked with a multi-label classification problem to predict which protein sidechain rotamers would be sampled at a given sequence position during a round of Rosetta’s rotamer substitution protocol with simulated annealing. [10] We see in Table 2 that the XENet models outperform their ECC and CrystalConv counterparts, although some of the CrystalConv models are in close competition with the best XENet models. In addition to having better loss and AUC scores, XENet convolutions appear to perform better with deeper architectures. XENet slightly improves when the third graph convolution layer is introduced, whereas ECC and CrystalConv exhibit a consistent drop in performance at that depth.

The reasons for these differences in performance can be readily motivated by considering the differences between the models themselves. First, ECC’s FGN is an indirect way of processing edge attributes and requires a strong supervision signal in order to be trained effectively, which may not be easy to attain especially within deeper architectures. Second, ECC was often shown to be most effective when processing data with one-hot encoded attributes [45, 46], which is not the case here.

Since CrystalConv does not use a FGN to process the edges, it does not have the same problems as ECC and its performance is more in line with XENet’s. However, the asymmetric processing of XENet, paired with its ability to update edge attributes to obtain a richer representation, make it more suitable for this particular type of data and results in a better overall performance in all configurations.

We show in Fig 2 that XENet can even handle depths of 4 and 5 GNN layers. The additional layers did not give us an advantage in validation loss; however, deeper architectures will theoretically be more advantageous for use cases that require more expansive message passing than our benchmark. For this reason, the mere ability to handle deeper architectures may prove to be a strength of XENet. XENet did encounter occasional failures with the deeper architectures but the majority of deeper models finished with competitive validation losses. We did not test CrystalConv or ECC with architectures of 4 or 5 layers due to their lack of success with 3 layers.

### Quantum FixbbGCN Benchmark

Now that we have these trained models, we want to see how much they can decrease the sizes of our quantum annealing use cases. We wrapped the best model for each architecture in Rosetta rotamer-elimination machinery and named it FixbbGCN (“fixbb” is a popular name for Rosetta’s fixed-backbone packing protocol).

We cannot run full-sized quantum benchmarks for the same reason that this project was motivated: our protein design benchmarks are too large to be run on the quantum computers. The best we can currently do is use FixbbGCN to design a subset of the protein on the quantum annealer and save the larger problems for the classical benchmark presented later in the article.

For this test, we needed a very small problem size. We took the smallest test case from our benchmark set but restricted sampling to only include the core of the protein. We used Rosetta’s definition of the core, which identified 10 residue positions that were sufficiently isolated from solvent exposure.

We chose this benchmark because the core is the most combinatorially challenging part of the protein to design. Rosetta samples core rotamers more finely than solvent-exposed residues so the rotamer count per position is higher. Additionally, these residue positions tend to have more neighbors, resulting in a more complex energy optimization problem.

XENet shows in Fig 3 an ability to decrease the rotamer count to roughly 60% before the dip in Rosetta score appears. ECC drops in quality near 70% and CrystalConv drops near 64%.

We did not report runtime for this benchmark because we had no way to decouple time spent running the annealer from time spent sending our data over the internet and waiting in the quantum computer’s queue. We do expect that runtime will correlate linearly with RAM usage as both have quadratic relationships with the rotamer count.

Using RAM as our guide, XENet is able to reduce our problem size to 32% before the decrease in design quality appears. The CrystalConv model came close with a decrease to 36% and the ECC model only shrank the problem to 43%.

### Classical FixbbGCN-XENet Benchmark

The goal for the final benchmark was to assess to what extent XENet’s pattern observed in the quantum benchmark persists for full-sized use cases. Unfortunately, these full-sized design cases are too large for us to run on quantum computers so we ran these benchmarks using Rosetta’s simulated annealer. This is the best we can do with current technology but hopefully a more complete test will be possible someday.

Similar to the quantum benchmark, this benchmark applies the XENet classifier with various cutoffs to Rosetta’s set of rotamers for six different protein design problems. This time, however, the entire protein structures are being designed. Rotamers are pruned if their predicted value from the classifier is below the cutoff. The “control” data point with the largest rotamer count for a given use case is the standard Rosetta packing protocol with no influence from the classifier.

We see in Fig 4 that we can use FixbbGCN to decrease the number of rotamers without a loss in design quality to a limited extent. The Rosetta score will generally stay in range of the control data down to the range of 55-60% of the original rotamer count.

The results in Fig 4 support the idea that FixbbGCN’s ability to eliminate rotamers for small problem sizes translates to larger problem sizes too. It is up to the user to decide how risky they want to be with FixbbGCN, but our results suggest that decreasing rotamer counts to roughly 60% is safe.

## Conclusion

Graph neural networks have great potential for modeling residue-level protein interactions. We show that our new convolution, XENet, can model residue-level environments better than existing methods ECC and CrystalConv. Not only does the usage of XENet result in lower validation losses, but we show that XENet can withstand deeper architectures.

To demonstrate XENet’s value, we use it to create a tool capable of fitting larger protein design problems onto quantum computers by eliminating sidechain conformations that are unlikely to be selected by an annealer. XENet was consistently able to reduce rotamer counts by 40% without loss in design quality. As a result, we measured a 68% decrease in total problem size, which has a quadratic relationship with rotamer count.

## Supporting information

## Quantum Benchmark Results

### XENet

### ECC

### CrystalConv

## Node and Edge attributes for FixbbGCN

These attributes can be reproduced with the following Python code:

```python
# pip install menten-gcn
import menten_gcn as mg

data_maker = mg.published.Maguire_Grattarola_2021()
data_maker.summary()

# This check ensures that the data_maker gives the expected values
data_maker.run_consistency_check()

# Visit https://menten-gcn.readthedocs.io/ to see
# how to use this data_maker with your protein
```

## Raw Data

This section attempts to comply with PLOS Computational Biology’s data policy. We provide all individual data points that are only summarized as means in the main text.

### Downloadables

Raw training data is publicly available at https://menten-ai-public.s3.us-east-2.amazonaws.com/Maguire-XENet-2021/all_training_data.tar.gz

Raw testing data is publicly available at https://menten-ai-public.s3.us-east-2.amazonaws.com/Maguire-XENet-2021/all_testing_data.tar.gz

Our best ECC model (used for quantum benchmark) is available in Keras h5 format at https://menten-ai-public.s3.us-east-2.amazonaws.com/Maguire-XENet-2021/best_ECC.h5

Our best CrystalConv model (used for quantum benchmark) is available in Keras h5 format at https://menten-ai-public.s3.us-east-2.amazonaws.com/Maguire-XENet-2021/best_CrystalConv.h5

Our best XENet model (used for quantum and classical benchmarks) is available in Keras h5 format at https://menten-ai-public.s3.us-east-2.amazonaws.com/Maguire-XENet-2021/best_XENet.h5

### Training Losses

Each table below lists the data points for the means and standard deviations reported in Table 2 (columns 3 and 4) of the main text.

### AUCs

Each table below lists the data points for the means and standard deviations reported in Table 2 (columns 5 and 6) of the main text.

### Quantum Benchmark Scores

## Acknowledgments

We thank Dr. Andrew Leaver-Fay and Brian Coventry for various Rosetta developments that made our workflow easier and Sergey Lyskov and Dan Farrell for their assistance in overcoming technical hurdles. We also thank D-Wave Systems, Inc. for useful discussions and support.

## Abbreviations

- AUC: Area Under Curve (used with respect to ROC)
- ECC: Edge-Conditioned Convolution
- FGN: Filter-Generating Network
- GCN: Graph Convolutional Network
- GNN: Graph Neural Network
- RAM: Random Access Memory
- REU: Rosetta Energy Units
- ROC: Receiver Operating Characteristic