Where the really hard choices are: A general framework to quantify decision difficulty

Current models of decision-making more often than not ignore the level of difficulty of choices or treat it only informally. Yet, difficulty has been shown to affect human decision quality. We propose instance complexity (IC), a measure of computational resource requirements, as a generalisable framework to quantify difficulty of a choice based on a small number of properties of the choice. The main advantage of IC compared to other measures of difficulty is fourfold. Firstly, it is based on the theory of computation, a rigorous mathematical framework. Secondly, our measure captures complexity that is intrinsic to a decision task, that is, it does not depend on a particular solution strategy or algorithm. Thirdly, it does not require knowledge of a decision-maker’s attitudes or preferences. And lastly, it allows computation of difficulty of a decision task ex-ante, that is, without solving the decision task. We tested the relation between IC and (i) decision quality and (ii) effort exerted in a decision using two variants of the 0-1 knapsack problem, a canonical and ubiquitous computational problem. We show that participants exerted more effort on instances with higher IC but that decision quality was lower in those instances. Together, our results suggest that IC can be used as a general framework to measure the inherent complexity of decision tasks and to quantify computational resource requirements of choices. The latter is particularly relevant for models of resource allocation in the brain (meta-decision-making/cognitive control). Our results also suggest that existing models of decision-making that are based on optimisation (rationality) as well as models such as the Bayesian Brain Hypothesis, are computationally implausible.


Introduction
We propose that computational complexity theory (CCT) provides a general theoretical framework that lends itself to characterising difficulty of decisions. CCT is a branch of computing theory that studies the computational resource requirements for solving a task [20][21][22]. Traditionally, CCT has been used to characterise complexity of computational problems. An example of a computational problem is sorting of an array. Other well-studied computational problems include the travelling salesman problem, the subset sum problem or the satisfiability problem. An instance of a problem is a particular case of the problem, for example, a particular array of numbers to be sorted. The traditional way of defining the computational complexity of problems is only of limited use for the study of decision-making for various reasons. Firstly, the approach measures the complexity of problems by studying how efficiently problems can be solved as they increase in size. This is done by considering how computational resource requirements scale, in the worst case, given the input size of the problem. Using the example of array sorting, problem complexity is concerned with the growth of computational resource requirements (e.g., number of computational steps, memory), in the worst case, as a function of the size of the initial array. Secondly, it ignores the fact that instances of a problem with a fixed input size can vary vastly in terms of computational resource requirements. For example, sorting an array that is already in the desired order will tend to take less time than sorting an array that is not.

We propose that instance complexity theory (IC), a related framework, is more useful for characterising difficulty of decisions. The aim of instance complexity is the characterisation of the computational complexity of individual instances of a problem, based on an instance's properties.
For example, in the case of array sorting, it would be based on properties of the input array. IC theory achieves this aim without reference to a particular algorithm or model of computation [23][24][25]. Thus, it is considered to characterise the inherent computational complexity of instances. Moreover, IC has been shown to be applicable to a wide range of problems including the Hamiltonian circuit problem [23], the graph colouring problem [23], the travelling salesman problem [26], the knapsack decision problem [27], and the K-SAT problems (Boolean satisfiability problems) [23,24,28]. These results suggest that the theory is general.

Here, we use IC to characterise the computational complexity of instances of the 0-1 knapsack decision problem. The problem involves selecting a subset from n items with which to fill a knapsack (rucksack) with a specified weight capacity c and a target profit p. Each item i has a weight w_i and a value v_i. The aim is to decide if there is a subset A of the items for which (1) the sum of weights ($\sum_{i \in A} w_i$) is lower than or equal to the capacity c and (2) the sum of values ($\sum_{i \in A} v_i$) is at least the target profit p (see S1 Appendix).

The knapsack problem is ubiquitous in everyday life. It is present in problems involving choice of stimuli to attend to, budgeting and time management, portfolio optimisation, intellectual discovery as well as in industrial applications such as the cargo business [29][30][31]. The problem can also be used to model the symptoms of certain mental disorders such as attention-deficit/hyperactivity disorder [31]. Additionally, the knapsack problem has been widely studied. Not only does there exist a wide range of algorithms to solve the knapsack problem and its extensions, but the computational complexity of the problem has also been investigated extensively [27,29].
To apply IC to the knapsack problem, we exploit an important mathematical and statistical property of the problem. When sampling a random instance, the probability that the correct answer to the instance is 'yes' (henceforth solvable) can be calculated based on a small set of characteristics of the instance itself [27]. This solvability probability exhibits a phase transition, that is, an abrupt shift between 0 and 1 within a narrow range of instance parameters [27]. This boundary separates instances of the problem into two regions: an under-constrained region where the constraints are lenient, and thus many solutions are likely to exist, and an over-constrained region where the constraints are stringent, and thus the existence of a solution is unlikely. Instances in the proximity of this boundary have substantially higher computational complexity than instances further away from it (Fig 1a). This means that there is a mapping from instance characteristics to computational complexity of the instance. We use this mapping as a basis to define IC for the knapsack problem.
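As an illustration of the decision problem and of the instance characteristic referred to above, the following minimal Python sketch checks solvability by brute force and computes the natural logarithm of the ratio of normalised profit to normalised capacity (the quantity on the horizontal axis of Fig 1a). The function names and the toy instance are hypothetical, and the sketch does not reproduce the solvability probability formula of [27]; it only shows which instance properties enter the ratio.

```python
from itertools import combinations
from math import log


def is_solvable(values, weights, capacity, target):
    """Brute-force 0-1 knapsack decision: is there a subset with
    total weight <= capacity and total value >= target?"""
    n = len(values)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            total_weight = sum(weights[i] for i in subset)
            total_value = sum(values[i] for i in subset)
            if total_weight <= capacity and total_value >= target:
                return True
    return False


def log_constraint_ratio(values, weights, capacity, target):
    """ln(normalised profit / normalised capacity), where each constraint
    is normalised by the sum over all items (assumed orientation of the
    ratio, following the Fig 1 caption)."""
    normalised_profit = target / sum(values)
    normalised_capacity = capacity / sum(weights)
    return log(normalised_profit / normalised_capacity)


# Hypothetical toy instance with three items.
values, weights = [10, 20, 30], [1, 2, 3]
print(is_solvable(values, weights, capacity=3, target=30))   # True
print(is_solvable(values, weights, capacity=3, target=31))   # False
print(log_constraint_ratio(values, weights, capacity=3, target=30))  # 0.0
```

Under this reading, instances whose ratio lies close to the phase transition would be labelled high IC, and instances far from it on either side low IC.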

In the present study, we tested whether IC thus defined predicts both effort exerted and decision quality in an instance. To this end, we conducted an experiment in which twenty participants each completed two variants of the 0-1 knapsack problem, the decision and the optimisation variant. The optimisation variant differs from the former in that the aim is to maximise the value of the items in the knapsack given a capacity constraint (see S1 Appendix). The two tasks are representatives, respectively, of the two main classes of computational problems, decision problems and optimisation problems. We predicted that performance would be lower in those instances with high IC in both variants. Moreover, we anticipated effort exerted to be positively correlated with IC.

Fig 1. (a) Probability of an instance being solvable as a function of the natural logarithm of the normalised profit to normalised capacity ratio (left axis), and compute time proxy (number of propagations using the Gecode solver) to solve an instance (right axis). The values correspond to the knapsack decision problem with 6 items. (b) Instance sampling for the behavioural experiment. Each point is an instance sampled as a function of the number of propagations and the natural logarithm of the normalised profit to normalised capacity ratio. An equal number of instances was sampled from each of the four regions: (i) overconstrained region, (ii) underconstrained region, and high IC region with a compute time proxy (iii) higher than the median of those instances within the high IC region and (iv) lower than the median of those instances within the high IC region. (c) Human performance by region in the Knapsack Decision Task. Mean computational performance and standard errors. Note: *p<0.1; **p<0.05; ***p<0.01; NS: not significant.

Knapsack Decision Task

Task structure. In this task, participants (n = 20) were asked to solve a number of instances of the (0-1) knapsack decision problem. In each trial, they were shown a set of items with different values and weights as well as a capacity constraint and a target profit. Participants had to decide whether there exists a subset of those items for which (1) the sum of weights is lower than or equal to the capacity constraint and (2) the sum of values yields at least the target profit.

Fig 2. (a) Knapsack Decision Task. Participants were first shown a set of items of different values and weights. This stage lasted 3 seconds. Then, both capacity constraint and target profit were shown at the centre of the screen. Participants had to decide whether there exists a subset of the items for which (1) the sum of weights is lower than or equal to the capacity constraint and (2) the sum of values yields at least the target profit. This stage lasted 22 seconds. Finally, participants had 2 seconds to make either a 'YES' or 'NO' response using the keyboard. A fixation cross was shown during the inter-trial interval (5 seconds). (b) Knapsack Optimisation Task. Participants saw a set of items of different values and weights together with a capacity constraint shown at the centre of the screen. The green circle at the centre of the screen indicated the time remaining in this stage of the trial. Participants had to find the subset of items with the highest total value subject to the capacity constraint. This stage lasted 60 seconds. Participants selected items by clicking on them and had the option of submitting their solution before the time limit was reached. After the time limit was reached or they submitted their solution, a fixation cross was shown for 10 seconds before the next trial started.
Instances. It has been shown that computational complexity of instances in the 0-1 knapsack decision problem can be characterised in terms of a set of instance properties [27]. These properties characterise the probability that an instance is solvable, that is, that there exists a subset of items with total weight below the capacity constraint and total value above the target profit. The solvability probability exhibits a phase transition [27], which can be characterised in terms of the ratio of the normalised capacity constraint (capacity constraint normalised by the sum of all item weights) and the normalised target profit (target profit normalised by the sum of all item values). IC is then defined to be higher the closer the instance is to the phase transition (see S1 Appendix for more information). We made use of this property to select instances with high and low IC (see Methods and S3 Appendix for more information). All instances in the experiment had 6 items.

The effect of instance complexity on performance. In order to test whether participants' ability to solve an instance was affected by its instance complexity (IC), we compared performance on instances in the phase transition (high IC) with instances outside the phase transition (low IC). Performance was significantly lower on instances in the phase transition (P < 0.001, main effect of phase transition proximity on performance, GLMM; Fig 3a; S1 Table Model 2). This suggests that IC affected participants' ability to solve an instance. We further tested this relationship using a continuous parameterisation of IC (see S4 Appendix). We found that this measure captures the negative effect of IC on human computational performance (P < 0.001, main effect of continuous measure of IC, GLMM; S4 Appendix).

Effect of solvability and tightness of constraints. We hypothesised that performance would be affected by solvability of an instance, that is, whether the answer to the decision problem was 'yes' or 'no'. In order to conclude that an instance is not solvable, every possible subset of items needs to be explored in order to determine that none of the subsets satisfies the constraints. Conversely, in case of solvable instances, finding a single subset of items is sufficient to determine that the instance is solvable. Such a set may be identified without exploring the full search space and, additionally, there may be more than one such subset. We investigated the effect of solvability and found that the effect of IC was still significant when controlling for solvability (P < 0.001, main effect of phase transition on performance, GLMM; S1 Table Model 3), but that there was no significant effect of solvability on performance (P = 0.355, main effect of solvability on performance; P = 0.796, interaction effect of phase transition and solvability on performance, GLMM; S1 Table Model 3).

Figure caption (fragment): The number of solutions is defined as the number of item combinations that satisfy both capacity and profit constraints. Note: *p<0.1; **p<0.05; ***p<0.01; NS: not significant.
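The asymmetry between 'yes' and 'no' instances described above can be made concrete with a small sketch. The Python code below is a hypothetical exhaustive search, not a model of what participants or the solvers did; the enumeration order (by subset size) and the function names are assumptions made for illustration. It returns the first subset that satisfies both constraints together with the number of subsets examined, and separately counts the number of solutions as defined in the figure caption.

```python
from itertools import combinations


def all_subsets(n):
    """Yield every subset of {0, ..., n-1}, smallest subsets first."""
    for size in range(n + 1):
        yield from combinations(range(n), size)


def satisfies(subset, values, weights, capacity, target):
    """True if the subset meets both the capacity and the profit constraint."""
    return (sum(weights[i] for i in subset) <= capacity
            and sum(values[i] for i in subset) >= target)


def first_witness(values, weights, capacity, target):
    """Return (witness or None, number of subsets examined).
    The search stops at the first witness; a 'no' answer needs all 2**n."""
    examined = 0
    for subset in all_subsets(len(values)):
        examined += 1
        if satisfies(subset, values, weights, capacity, target):
            return subset, examined
    return None, examined


def count_solutions(values, weights, capacity, target):
    """Number of item combinations satisfying both constraints."""
    return sum(satisfies(s, values, weights, capacity, target)
               for s in all_subsets(len(values)))


values, weights = [10, 20, 30], [1, 2, 3]
print(first_witness(values, weights, 3, 30))    # ((2,), 4): stops early
print(first_witness(values, weights, 3, 31))    # (None, 8): full search
print(count_solutions(values, weights, 3, 30))  # 2
```

In this toy instance the solvable case is settled after 4 of the 8 subsets, whereas the unsolvable case requires all 8.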

For solvable instances, the tightness of the constraints of an instance can be studied further by analysing the number of subsets of items that satisfy the constraints (Fig 3b, 1c). We found that for solvable instances, the probability of reaching the correct solution increases as the number of subsets that satisfy the constraints increases (P = 0.001, main effect of number of subsets on computational performance, GLMM; S1 Table Model 8).
This suggests that participants were more likely to find a solution when there were more possible solutions available. Moreover, this probability increased faster if the instance was in the phase transition (P < 0.001, interaction effect of phase transition and number of subsets on computational performance, GLMM; S1 Table Model 8). Furthermore, we found that the mean number of solutions of solvable instances with high IC was lower than for those with a low IC (P < 0.001, unpaired t-test).

We also hypothesised that performance would be affected by the tightness of the profit and capacity constraints. We tested whether performance on instances in the over-constrained region was different to performance on instances in the under-constrained region (both of which are outside the phase transition region and thus have low IC). We found no significant difference in performance between the two regions (P = 0.355, main effect of region, GLMM; S1 Table Model 7; Fig 1c), but confirmed a significant difference in performance between the phase transition region and each of the other two regions (P < 0.001, difference in performance between regions, GLMM; S1 Table Model 6).

Algorithm-specific complexity measures and performance. So far, we have used instance complexity measures that are independent of any particular solution algorithm or strategy. That is, we have characterised instance complexity purely in terms of a small set of instance properties. We now investigate whether participants' performance was related to the computational resource requirements of two generic solution algorithms. In particular, we tested whether human performance was related to the number of computational operations these algorithms needed to perform in order to solve an instance.

To perform this test, we considered two widely-used, generic solution algorithms, Gecode [32] and MiniSat+ [33,34]. Gecode is a constraint-based solver that uses a constraint propagation technique with different search methods, such as branch-and-bound. MiniSat+, on the other hand, transforms the problem into a sequence of satisfiability problems that are then solved using constraint propagation and backtracking. For each of these solvers, we chose an output variable that indicates the difficulty for the algorithm to find a solution and whose value is highly correlated with computational time. For MiniSat+ we used the number of decisions and for Gecode we used the number of propagations. Both metrics measure the search effort the respective solver had to make to find the solution, which is related to the number of computational steps performed and thus to computational time (see S2 Appendix). We did not use computational time directly because for small instances, like the ones used in this study, computational time is highly confounded by time spent on reading in the instance, which is not the case for the other variables we considered.

We found that performance in the instances was negatively related to the number of propagations the Gecode algorithm used (P < 0.001, main effect of number of propagations, GLMM; S1 Table Model [...]

Instances. To generate instances for the task, a sampling process similar to the one for the Knapsack Decision Task was used (see the Methods section and S3 Appendix for more information). The IC of the optimisation instances was defined according to the IC of the corresponding decision problem at the solution (see S1 Appendix).

Summary statistics. We excluded 2 trials (from 2 participants) because solutions were submitted less than 1 second into the task. In the analysis of submission times, 3 participants were excluded because they never submitted a solution before the time-out, suggesting that they did not understand the submission instructions.
We define computational performance as a dichotomous variable that is equal to 1 if the participant obtained a value equal to the maximum value obtainable in the instance, and 0 otherwise. Mean computational performance was 83.2% (min = 0.67, max = 0.94, SD = 0.08). Participants spent 43.5 seconds on average on an instance (min = 27.4, max = 60.0, SD = 8.9). Participants were allowed to select any set of items, irrespective of the capacity constraint, which implied that they had to ensure that their candidate solution met the capacity constraint. The capacity constraint was only violated in 3% of instances. Performance did not change throughout the task (P = 0.683, main effect of trial number on performance, GLMM; S2 Table Model [...] performance.

The relation between instance complexity and performance. We hypothesised that computational performance in instances in the phase transition would be lower than in instances outside the phase transition. We found that mean computational performance was lower in those instances whose solutions have a corresponding decision problem in the phase transition, relative to those instances whose solutions have a [...] Table Model 2).

So far, we have defined computational performance as a dichotomous variable. We now look at a finer-grained measure. To this end, we define item performance as the minimum number of item replacements that are necessary to change a candidate solution to the optimal solution. This includes both the removal of items that are not in the optimal solution and the addition of items that are in the optimal solution (but not part of the candidate solution). The higher the value of this measure, the further away the candidate solution is from the optimum. We found that item performance thus defined was lower, on average, in instances with high IC relative to instances with low IC (P < 0.001, main effect of phase transition, LMM; S4 Table Model 2).

Another way of defining performance is in terms of value obtained in an instance. We define economic performance as the ratio of the total value of items in the submitted solution to the total value of items in the optimal solution. We found that economic [...] Table Model 6). [...] Table Model 3). Taken together with previous results, it appears that the relation between effort and computational performance is moderated by instance complexity. The fact that the probability of finding the optimal solution is lower when participants spend more effort may have been caused by participants spending more effort on instances with a high IC. This, however, suggests that participants are somehow able to adjust their level of effort in response to instance complexity, which we will return to in the Discussion.
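The three performance measures defined in this section (computational, item and economic performance) can be sketched in a few lines of Python. This is an illustrative reading of the verbal definitions, not the analysis code used in the study. The function names are hypothetical. Two assumptions are made: a submission that violates the capacity constraint is scored 0 on computational performance, and each removal or addition of an item counts as one replacement, so that item performance is the size of the symmetric difference with a given optimal set.

```python
def total(indices, quantities):
    """Sum of the given per-item quantities over a set of item indices."""
    return sum(quantities[i] for i in indices)


def computational_performance(submitted, values, weights, capacity, optimal_value):
    """1 if the submission is feasible (assumed requirement) and attains
    the maximum obtainable value, 0 otherwise."""
    feasible = total(submitted, weights) <= capacity
    return int(feasible and total(submitted, values) == optimal_value)


def item_performance(submitted, optimal):
    """Number of removals plus additions needed to turn the submitted
    set into the given optimal set (size of the symmetric difference)."""
    return len(set(submitted) ^ set(optimal))


def economic_performance(submitted, values, optimal_value):
    """Value of the submitted items as a fraction of the optimal value."""
    return total(submitted, values) / optimal_value


# Hypothetical instance: capacity 3, optimum value 30 (e.g. items {0, 1}).
values, weights, capacity = [10, 20, 30], [1, 2, 3], 3
submitted = {1}
print(computational_performance(submitted, values, weights, capacity, 30))  # 0
print(item_performance(submitted, {0, 1}))                                  # 1
print(economic_performance(submitted, values, 30))
```

When an instance has several optimal sets, one natural choice (again an assumption) is to take the minimum of `item_performance` over all of them.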

In order to further examine the relationship between optimisation instances, effort and IC, we examined the amount of time participants spent on each selection of items, that is, the time between one click and the next. After each click participants were faced with the question: "Is there another set of items with a higher profit that still satisfies the weight capacity constraint?" We found that participants spent more time at those stages in which there were fewer options that yielded a more valuable solution whilst still satisfying the capacity constraint (P < 0.001, main effect of the number of more valuable solutions, LMM; S5 Table).
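The quantity used in this analysis, the number of more valuable solutions available at a given stage, can be illustrated with a short Python sketch. The function name and the brute-force enumeration are assumptions made for illustration; the sketch simply counts the item sets that satisfy the capacity constraint and have a strictly higher total value than the current selection.

```python
from itertools import combinations


def count_more_valuable(current, values, weights, capacity):
    """Count feasible item sets whose total value strictly exceeds
    the total value of the current selection."""
    current_value = sum(values[i] for i in current)
    n = len(values)
    count = 0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            feasible = sum(weights[i] for i in subset) <= capacity
            better = sum(values[i] for i in subset) > current_value
            if feasible and better:
                count += 1
    return count


# Hypothetical instance with capacity 3.
values, weights = [10, 20, 30], [1, 2, 3]
print(count_more_valuable({0}, values, weights, 3))  # 3
print(count_more_valuable({2}, values, weights, 3))  # 0: already optimal
```

A count of 0 corresponds to a stage at which the answer to the question quoted above is 'no'.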
Relation between algorithm-specific complexity measures, effort and performance. We next examined a set of alternative complexity measures based on the generic solution algorithms Gecode and MiniSat+. We found qualitatively similar results to those of the knapsack decision problem, with higher instance difficulty, as measured by Gecode propagations, associated with lower average performance (P < 0.001, main effect of number of propagations, GLMM; S2 Table Model 4). For the MiniSat+ number of decisions this effect was not significant (P = 0.157, main effect of number of decisions, GLMM; S2 Table Model 5).

We also examined whether these complexity measures were related to the time spent on each of the instances. We found that, in line with previous results, instances with higher Gecode propagations were associated with higher levels of effort (P < 0.001, main effect of number of propagations, LMM; S3 Table Model 3). We found a similar relation for the MiniSat+ decision measure (P = 0.001, main effect of number of decisions, LMM; S3 Table Model 4).

We also analysed the relation between computational performance and Sahni-k, another measure of instance complexity. Sahni-k is proportional to both the number of computations and the amount of memory required to solve an instance of the Knapsack Optimisation Task. This metric has previously been shown to be associated with performance in the Knapsack Optimisation Task [15,30]. We found a negative relation [...] findings of a previous study [15]. However, when controlling for IC, the effect of Sahni-k on effort is no longer significant (P = 0.580, main effect of Sahni-k, LMM; S3 Table Model 7), in line with results reported above.
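The text does not spell out how Sahni-k is computed. As a rough illustration only, the Python sketch below assumes the common reading in which Sahni-k is the smallest k such that fixing some feasible set of k items and filling the remaining capacity greedily, in order of decreasing value-to-weight ratio, attains the optimal value (so that k = 0 means the greedy algorithm alone suffices). The function name, the tie-breaking order and the brute-force computation of the optimum are assumptions, and item weights are assumed to be positive.

```python
from itertools import combinations


def sahni_k(values, weights, capacity):
    """Smallest k for which 'fix k items, then fill greedily by
    value/weight ratio' reaches the optimal value (assumed definition)."""
    n = len(values)
    # Optimal value by exhaustive search (adequate for small instances).
    optimum = max(
        sum(values[i] for i in s)
        for size in range(n + 1)
        for s in combinations(range(n), size)
        if sum(weights[i] for i in s) <= capacity
    )
    # Greedy order: highest value per unit weight first.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    for k in range(n + 1):
        for fixed in combinations(range(n), k):
            weight = sum(weights[i] for i in fixed)
            if weight > capacity:
                continue  # the fixed items alone violate the constraint
            value = sum(values[i] for i in fixed)
            for i in order:
                if i not in fixed and weight + weights[i] <= capacity:
                    weight += weights[i]
                    value += values[i]
            if value == optimum:
                return k
    return n


print(sahni_k([10, 20, 30], [1, 2, 3], 3))         # 0: greedy is optimal
print(sahni_k([60, 100, 120], [10, 20, 30], 50))   # 2
```

Under this reading, higher values of k indicate instances on which the greedy heuristic needs more help, which is consistent with Sahni-k being used as a measure of instance difficulty.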

Relation between performance in knapsack tasks and cognitive function

Finally, we investigated the relation between performance in two knapsack tasks and various aspects of cognitive function. In particular, we used tests aimed at assessing mental arithmetic, working memory, episodic memory, strategy use as well as processing and psychomotor speed. Correlations between performance in these tasks and the knapsack tasks were all non-significant (see Methods and S6 [...]

Discussion

Current models of decision-making more often than not ignore the level of difficulty of problems or treat it only informally [1][2][3]. We propose a generalisable framework to quantify difficulty of a decision task based on the decision's inherent complexity. The framework is based on instance complexity (IC) theory, a branch of computational complexity theory that relates properties of instances of a computational problem to computational resource requirements. We tested the effect of IC on decision quality in two variants of a canonical task, the decision and optimisation variants of the 0-1 knapsack problem. We also examined effort exerted in the optimisation variant of the 0-1 knapsack problem. We found that IC negatively affects decision quality in both tasks. Moreover, we found that more effort was exerted on instances with higher IC.

The aim of IC theory is to characterise the relation between the amount of computational resources (time) required by an algorithm to solve an instance, and properties of the instance. It has been shown for several decision problems (most of them NP-complete) that the probability of an instance having a particular solution (yes/no) can be expressed in terms of an order parameter that is based on a small number of instance properties. Moreover, this probability exhibits a phase transition, that is, there exists a narrow range of values of the order parameter within which the probability of a yes answer changes from close to 0 to close to 1 [23][24][25][26][27][35]. It has been conjectured that solvability of all NP-complete problems exhibits such a phase transition in terms of an order parameter and that the hard instances, in terms of compute time, of those problems lie in the proximity of the phase transition [23]. It was recently shown that a similar link between hardness of instances and a phase transition in solvability exists for the 0-1 knapsack problem [27]. We exploited this link in the present study.

[...] situations that involve solving difficult tasks, whereas complexity seeking could lead to situations in which people seek tasks that require a high amount of effort to be solved [36]. Another way that complexity could be related to behaviour is through its effect on confidence. In the case of the Knapsack Optimisation Task it is still an open question how participants chose when to submit their answer. The IC level could influence the confidence in having found the solution, and in turn this confidence could play a role in the decision of when to submit an answer. We leave it to future work to explore the effects of attitudes towards or preferences for complexity in decision-making, as well as the relation between IC, confidence and behaviour.
Which algorithms did participants use? In addition to analysing IC as a measure of complexity, we investigated other complexity measures that are related more explicitly to the number of computational steps (time) required by an electronic computer to solve an instance. We found that one of the two algorithm-specific complexity measures we considered correlated with both human performance and effort exerted. This is probably related to the main features of each of the algorithms. It is unlikely that humans reformulate the problem as a Boolean satisfiability problem in order to reach a solution (MiniSat+). It is more likely that they compute directly on the problem itself as a directed search based on the constraints (Gecode). These results suggest that the computational mechanisms that humans use might be similar in nature to those of particular computer algorithms, a notion that should be explored in more detail by future research.

The relation between decision and optimisation tasks. Although the knapsack optimisation and decision problems are two fundamentally different types of computational problems, they are related to each other at a theoretical level. Specifically, the optimisation problem can be solved by the iterative solution of a series of corresponding decision problems. Based on this link, we defined IC for the optimisation problem and found a lower performance on instances with higher IC, thus mirroring the decision problem results. This is further evidence in support of our theoretical framework. We also found that participants who performed better in the decision task tended to perform better in the optimisation task. The latter finding suggests that individual constraints that affected performance were active in both tasks.
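The theoretical link between the two problems can be illustrated with a minimal Python sketch. It finds the optimal value of an optimisation instance by repeatedly asking a decision oracle whether a given target profit is attainable, using a binary search over the target. The choice of binary search, the brute-force oracle, the function names and the restriction to non-negative integer values are assumptions made for illustration; the text only states that a series of corresponding decision problems is solved.

```python
from itertools import combinations


def decide(values, weights, capacity, target):
    """Decision oracle: is there a subset with weight <= capacity
    and value >= target? (Brute force, for small instances only.)"""
    n = len(values)
    return any(
        sum(weights[i] for i in s) <= capacity
        and sum(values[i] for i in s) >= target
        for size in range(n + 1)
        for s in combinations(range(n), size)
    )


def optimise_via_decisions(values, weights, capacity):
    """Largest attainable target profit, found by binary search over
    a series of decision problems (assumes non-negative integer values)."""
    low, high = 0, sum(values)  # target 0 is always met by the empty set
    while low < high:
        mid = (low + high + 1) // 2
        if decide(values, weights, capacity, mid):
            low = mid        # mid is attainable: optimum is at least mid
        else:
            high = mid - 1   # mid is not attainable: optimum is below mid
    return low


print(optimise_via_decisions([10, 20, 30], [1, 2, 3], 3))  # 30
```

The last decision problems in the series have target profits at or just above the optimum, which matches the idea of defining the IC of an optimisation instance through the corresponding decision problem at the solution.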

The relation between IC and effort exerted. One interesting finding is that effort exerted on an instance was adjusted according to IC. This result is perplexing. In order to know which resources a computer algorithm needs to solve an instance, it is necessary for the algorithm to find the solution. That is, a computer algorithm can only compute resource requirements of an instance ex post. In contrast, we found that participants adjusted their effort to IC even without being able to find the solution at all. This result is consistent with the findings of a previous study that used a different measure of instance complexity [15].
It is an open question which mechanisms participants used to adjust effort. It has recently been suggested that the brain allocates resources to tasks according to the expected benefits and expected costs, in particular cognitive resource requirements, related to the task [16,37,38,39]. These accounts also suggest that decision-makers learn to estimate costs and benefits of a task based on a set of task features [17][18][19]. These accounts, however, do not specify what these features might be. In fact, selection of these features might be in itself an NP-hard problem. It is conceivable that decision-makers use IC to estimate the expected costs of performing a task. This would require that decision-makers can somehow detect IC [1]. Future research should investigate possible mechanisms of detecting IC.

Performance in the knapsack tasks and basic cognitive abilities. Individual differences in performance in the knapsack tasks were independent of individual differences in the set of core cognitive abilities including attention, working memory and mental arithmetic. One possible explanation for the lack of correlation is that these cognitive abilities play only a minor role in solving computationally hard problems and that those problems instead require another cognitive ability that is not captured by any of the tests we administered. Another possible explanation is that we did not measure the active cognitive constraint that drove differences in individual performance. One candidate for such a constraint is memory [40,41]. It is, of course, also possible that our study did not have sufficient statistical power to detect individual differences. Further research is needed in order to incorporate the full spectrum of cognitive resource limitations and link them to performance and effort in decision tasks [1].

[...] is an open research question in computer science [42]. Further research would be required to characterise the probability distribution of knapsack instances found outside of the laboratory setting.
Furthermore, in our study, the task involved finding the optimal solution. However, finding the exact solution might not always be required in the real world. In many cases finding an approximate solution might suffice. However, for many NP-complete and NP-hard problems, approximating the solution is as hard as finding the optimal solution [20,43]. It is still an open theoretical question whether IC can be extended to approximation problems. Future research should investigate whether the results found in this study, for both humans and computers, can be extended to approximation instances.

The Church-Turing thesis. A core notion in the theory of computation is the Church-Turing thesis. The thesis states that the universal Turing machine is a general model of computation, which implies that any input/output operation that can be performed by a human computer can also be performed by the universal Turing machine [44][45][46]. Our findings support a related notion: that an algorithm that requires a larger amount of computational resources (time) on a universal Turing machine (here, an electronic computer) also requires relatively more computational resources in the human brain. Thus, our findings strongly suggest that computational tasks have inherent complexity, that is, the amount of computational resources required to solve them is independent of the particular computational model used. The framework we present in this paper is a candidate for the quantification of inherent complexity of decision tasks.

Implications for decision theory and public policy
Many theories of decision-making (including meta-decision-making) assume that people optimise [4-7, 9-11, 16, 18, 38, 47]. Our results are consistent with previous results that show that this is often not the case [7,48]. We show that performance is dependent on task complexity, thus corroborating previous studies that highlight the relevance of incorporating cognitive resource requirements and limitations into decision theory [1,15,49]. In addition, our approach allows for a generalisable and formal quantification of those resource requirements in decision and optimisation tasks.

In a broader context, the present study might help to identify the limits of human cognition and decision-making. This is crucial for the design of policies that aim to improve the quality of decisions such as financial investments and the selection of insurance contracts, among many others. In those cases where the task is too demanding, mechanisms could be designed to help people improve the quality of their decisions. This could be done, for instance, through software applications that take advantage of the computational power of electronic computers. Finally, our results advocate for closer collaboration between decision scientists and computer scientists. Not only can decision sciences be informed by computation theory, as done in this study, but research on humans could motivate the development of new theories and algorithms.

Item values, in dollars, were displayed using dollar bills and weights, in grams, were shown inside a black weight symbol. The larger the value of an item, the larger the dollar bill was in size. Similarly, the larger the weight of an item, the larger its weight symbol was in size. At the centre of the screen, a green circle indicated the time remaining in this stage. In the second stage (22 s), target profit and capacity constraint were added to the screen inside the green timer circle. In the third stage (2 s), participants saw 'YES' and 'NO' buttons on the screen, in addition to the timer circle, and made a response using the keyboard (Fig 2a). A fixation cross was then shown (5 s) before the start of the next trial.

[...] properties [27] (Fig 1a). In particular, IC can be characterised in terms of the ratio of the normalised capacity constraint (capacity constraint normalised by the sum of all item weights) and the normalised target profit (target profit normalised by the sum of all item values) (see S1 Appendix for more information).
We made use of this property to select instances for the task (see S3 Appendix for more information).

Throughout we ensured that no weight/value combinations were sampled twice. To ensure enough variability between instances in the phase transition, we added a further constraint to the sampling from each bin. We forced half of the instances selected in each bin in the phase transition to be easier than the median according to an algorithm-specific ex-post complexity measure (Gecode propagations parameter) and the other half to be harder than the median (Fig 1b). The order of presentation of instances in the task was randomised for each participant.
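The normalisation described above can be illustrated with a short Python sketch. It only computes the two normalised quantities and their ratio; how IC is derived from them is covered in S1 Appendix and is not reproduced here. The function name and the example instance are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def normalised_parameters(
    weights: Sequence[float],
    values: Sequence[float],
    capacity: float,
    target_profit: float,
) -> Tuple[float, float, float]:
    """Return (normalised capacity, normalised target profit, their ratio)
    for one instance of the knapsack decision problem."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    total_weight = sum(weights)
    total_value = sum(values)
    if total_weight <= 0 or total_value <= 0 or target_profit <= 0:
        raise ValueError("totals and target profit must be positive")
    # Capacity constraint normalised by the sum of all item weights.
    norm_capacity = capacity / total_weight
    # Target profit normalised by the sum of all item values.
    norm_profit = target_profit / total_value
    return norm_capacity, norm_profit, norm_capacity / norm_profit


# Hypothetical instance, for illustration only.
weights = [10, 20, 30, 40]   # grams
values = [15, 25, 35, 25]    # dollars
norm_capacity, norm_profit, ratio = normalised_parameters(
    weights, values, capacity=40, target_profit=60
)
print(norm_capacity, norm_profit)  # 0.4 0.6
```

In this illustrative instance the normalised capacity (0.4) falls at the lower edge of the 0.4-0.45 range that the paper reports for its capacity bin.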

Knapsack Optimisation Task
Task structure
In this task, participants were asked to solve a number of instances of the (0-1) knapsack optimisation problem. In each trial, they were shown a set of items with different weights and values as well as a capacity constraint. Participants had to find the subset of items that maximises total value subject to the capacity constraint. This means that while in the knapsack decision problem, participants only needed to determine whether a solution exists, in the knapsack optimisation problem, they also needed to determine the nature of the solutions (items in the optimal knapsack).
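The difference between the optimisation and decision variants can be made concrete with a minimal Python sketch. It uses textbook dynamic programming over integer capacities, which is an assumption made for illustration: it is not the solver used in the study (the paper refers to Gecode for its complexity measure), and the function names and example instance are hypothetical.

```python
from typing import List, Sequence, Tuple


def solve_knapsack(
    weights: Sequence[int], values: Sequence[int], capacity: int
) -> Tuple[int, List[int]]:
    """0-1 knapsack optimisation: return (optimal value, chosen item indices).

    Assumes non-negative integer weights and a non-negative integer capacity.
    """
    n = len(weights)
    # best[i][c] = best total value using only the first i items at capacity c.
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]          # leave item i-1 out
            if w <= c and best[i - 1][c - w] + v > best[i][c]:
                best[i][c] = best[i - 1][c - w] + v  # put item i-1 in
    # Walk the table backwards to recover one optimal subset.
    chosen: List[int] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen


def decision_answer(
    weights: Sequence[int], values: Sequence[int], capacity: int, target_profit: int
) -> bool:
    """Knapsack decision variant: is there a feasible subset whose total value
    reaches the target profit?"""
    optimum, _ = solve_knapsack(weights, values, capacity)
    return optimum >= target_profit


# Hypothetical instance, for illustration only.
weights = [10, 20, 30, 40]
values = [15, 25, 35, 25]
optimum, items = solve_knapsack(weights, values, capacity=40)
print(optimum, items)  # 50 [0, 2]
```

The decision variant only reports yes or no, whereas the optimisation variant must also return the items themselves, which mirrors the distinction drawn in the task description.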

The task had two stages. In the first stage (60 s), the items were presented together with the capacity constraint and the timing indicator. Items were presented as in the Knapsack Decision Task. During this stage, participants were able to add and remove items to/from the knapsack by clicking on the items. An item added to the knapsack was indicated by a light around it (Fig 2b). Participants submitted their solution by pressing the button 'D' on the keyboard before the time limit was reached. If participants did not submit within the time limit, the items selected at the end of the trial were automatically submitted as the solution. Participants were then shown a fixation cross (10 s) before the start of the next trial.

Instances
To generate instances for the task, a sampling process similar to the one for the Knapsack Decision Task was used (see S3 Appendix for more information). We selected the same normalised capacity bin as for the Knapsack Decision Task (0.4-0.45) and selected the normalised profit of the solution such that the corresponding decision problem (see S1 Appendix) lay in the phase transition (0.6-0.65) or in the over-constrained region (0.85-0.9). Again, we forced half of the instances selected in each of the bins in the phase transition to be easier than the median, according to the Gecode propagations measure, and the other half to be harder than the median. We sampled a total of 18 instances, 12 in the phase transition and 6 out of the phase transition. The order of presentation of instances in the task was randomised for each participant.
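The binning and median split described above can be sketched in Python as follows. The bin edges are the ones quoted in the text; treating the bins as half-open intervals, the helper names and the propagation counts are assumptions made for illustration, and the full sampling procedure is described in S3 Appendix.

```python
import statistics
from typing import Dict, List, Optional, Tuple

# Bin edges as quoted in the text (half-open intervals are an assumption).
CAPACITY_BIN = (0.40, 0.45)
PHASE_TRANSITION_BIN = (0.60, 0.65)
OVER_CONSTRAINED_BIN = (0.85, 0.90)


def in_bin(x: float, bounds: Tuple[float, float]) -> bool:
    low, high = bounds
    return low <= x < high


def region(norm_capacity: float, norm_profit: float) -> Optional[str]:
    """Name the sampled region an instance falls in, or None if it falls in neither."""
    if not in_bin(norm_capacity, CAPACITY_BIN):
        return None
    if in_bin(norm_profit, PHASE_TRANSITION_BIN):
        return "phase transition"
    if in_bin(norm_profit, OVER_CONSTRAINED_BIN):
        return "over-constrained"
    return None


def median_split(hardness: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split instance ids into (easier than median, harder than median)
    according to an ex-post hardness measure such as a propagation count."""
    median = statistics.median(hardness.values())
    easier = sorted(k for k, h in hardness.items() if h < median)
    harder = sorted(k for k, h in hardness.items() if h > median)
    return easier, harder


# Hypothetical propagation counts for four candidate instances in one bin.
propagations = {"a": 120, "b": 480, "c": 95, "d": 910}
easier, harder = median_split(propagations)
print(region(0.42, 0.62), easier, harder)
# phase transition ['a', 'c'] ['b', 'd']
```

Sampling equal numbers from the two halves would then give the balance between easier and harder instances that the text describes for the phase transition bins.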

Mental arithmetic task
In this task, participants were presented with 33 mental arithmetic problems [50]. The first three trials were considered test trials and thus were not included in the analysis. Participants were given 13 seconds to solve each problem. The task involved addition and division of numbers, as well as questions in which they were asked to round the result of an addition or division operation to the nearest integer.

Basic cognitive function tasks
In addition, we tested participants' performance on four aspects of cognitive function that we considered relevant for the knapsack tasks, namely, working memory, episodic memory, strategy use as well as processing and psychomotor speed.

After reading the plain language statement and providing informed consent, participants were instructed in each of the tasks and completed a practice session for each task. Participants first solved the CANTAB RTI task, followed by the Knapsack Decision Task. Then they completed the CANTAB RTI task again, followed by the Knapsack Optimisation Task. Subsequently, they completed the other CANTAB tasks, in the following order: PAL, SWM and SSP. Finally, they performed the mental arithmetic task and completed a set of demographic and debriefing questionnaires. Each experimental session lasted around two hours.

The Knapsack Decision Task, Knapsack Optimisation Task and mental arithmetic task were programmed in Unity3D [52] and administered on a laptop. The CANTAB tasks were administered on a tablet.

The R programming language was used to analyse the behavioural data. Python (version 3.6) was used to sample instances and run the simulations.

All of the generalised logistic mixed models (GLMM) and linear mixed models (LMM) included random effects on intercept for participants. Their p-values were calculated using a two-tailed Wald test. All statistical analyses were done in R [53] and mixed models were estimated using the R package lme4 [54].
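As a rough illustration of the two-tailed Wald test mentioned above, the following Python sketch converts a coefficient estimate and its standard error into a p-value. It assumes a standard normal reference distribution for the test statistic; the function name and the example numbers are hypothetical, and the study's own analyses were run in R with lme4 rather than with this code.

```python
import math


def wald_two_tailed_p(estimate: float, std_error: float) -> float:
    """Two-tailed Wald p-value for H0: coefficient = 0, assuming the
    statistic z = estimate / std_error is standard normal under H0."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    z = estimate / std_error
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical fixed-effect estimate and standard error, for illustration only.
p_value = wald_two_tailed_p(estimate=-0.98, std_error=0.5)
print(round(p_value, 3))  # 0.05
```

Because the test is two-tailed, the sign of the estimate does not affect the p-value.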

The raw behavioural data, the data analysis code and the computational simulations are all available from the Open Science Framework.

The Knapsack Decision Task, Knapsack Optimisation Task and mental arithmetic task were programmed in Unity3D [52] and are also available from the Open Science Framework.