Ozgun, Onur with Yaman Barlas, "Systemic Complexity of a Growth Management Game: Comparative Analysis of Decision Heuristics and Experimental Results", 2012 July 22-2012 July 26

The 30th International Conference of the System Dynamics Society, St. Gallen, Switzerland. July 22 ~ 26, 2012.

Systemic Complexity of a Growth
Management Game: Comparative Analysis
of Decision Heuristics and Experimental
Results

Onur Özgün and Yaman Barlas
Boğaziçi University
Department of Industrial Engineering, 34342 Bebek, Istanbul, Turkey
Tel: +90 212 3597343, Fax: +90 212 2651800

E-mail: onur.ozgun@boun.edu.tr, ybarlas@boun.edu.tr

Abstract

In this study, using different versions of a growth management game involving two
different complexity factors, we compare performances of heuristic rules with experi-
mental results. We present a method for obtaining a statistical distribution of scores
resulting from a given simulated decision heuristic, which can be used to compare
against and assess experimental gaming results. The method is based on the idea of
generating a vast number of scores by stochastically simulating a given decision rule and
obtaining the resulting score distribution. We use this method to compare scores from
different game versions whose scores are essentially not comparable, and to see how
the score distributions change from one game version to another. In simulations,
we first use a simple random “decision rule” and then develop a more intelligent
hill-climbing heuristic. The results show that when the games involve delay, human
subjects do not perform better than the random heuristic —a primitive rule composed
of a sequence of random decisions. On the other hand, in nonlinear games, subjects
outperform the random heuristic and their scores fit the score distribution of the
hill-climbing heuristic better. We also demonstrate how the score distribution from the random
heuristic can be used as a reference performance measure.

Keywords: growth management, systemic complexity, decision heuristics, simu-
lation games, gaming experiments

1 Introduction

Simulation games are widely used as a research tool in the field of dynamic decision-making
because they provide a simple controlled environment for testing hypotheses on human
psychology and behavior. Tested hypotheses are often on the effect of a factor on the
specified dependent variable. These factors can be related to (1) the underlying model
such as delay or strength of feedback, (2) the simulator characteristics such as decision
interval or transparency, or (3) the player characteristics such as mental model or cognitive
style (Rouwette et al., 2004). In most of the studies, the dependent variable is the task
performance, often measured in terms of total cost (Trees et al., 1996; Diehl and Sterman,
1995), cumulative profit (Größler et al., 2000; Yang, 1996), market growth (Bakken, 1993)
or deviation from some benchmark (Paich and Sterman, 1993; Moxnes and Saysel, 2009).

¹ Research supported by Boğaziçi University Research Grant no. 09HA301D

With the purpose of assessing the influences of systemic complexity factors on the overall
complexity of a simulation game, we designed a growth management game and its different
versions involving nonlinearity, delay and feedback. The objective of the experiment was
to determine a strength level for each factor where the factor becomes effective on game
complexity. One of the measures of game complexity was subjects’ performances, measured
by the amount of growth they achieved. During the analysis, we realized that it is very
difficult to find an objective measure of growth that works for all game versions. Although
the performance measure we used (normalized cumulative profit) was an adequate measure
for comparing scores coming from different levels of a factor (which was our objective), we
wanted to develop a method for comparing scores coming from games involving different
factors.

To illustrate the difficulty of comparing scores from different game versions, suppose that
the possible scores in a game are between 0 and 100. Also suppose that introducing delay
to the game changes this possible range to 0-500. In that case, the analyst should rescale
the scores from these two versions to make them comparable. The rescaling problem may
be relatively easy if the analyst can determine some constants in the score distribution, like
the minimum and the maximum possible scores. Unfortunately, it is often very difficult,
if not impossible, to determine such constant references. In rescaling the scores, instead
of constant references, it is possible to use some reference scores such as the do-nothing
strategy or a standard heuristic. However, applying a fixed strategy does not guarantee a
fixed position in the score distribution. For example, the do-nothing strategy might yield the
minimum score in one version whereas it might correspond to an average performance in
another version. Considering our earlier example, assume that the scores from the base
game are symmetrically distributed in the range of 0-100. A score of 50 would reflect
an average performance in this version. Now assume that the scores of the game version
involving delay are strongly right-skewed, i.e. the relative probability of having low scores
is higher than the probability of having high scores. In this case, a score of 250 would
correspond to a very good performance despite being located at the mid-point of the range.

We tackle the problem of rescaling the game scores by making use of the statistical distri-
bution of possible scores. If we know the distribution of possible scores obtainable from the
game, we can tell the probability of acquiring a particular score by chance. In this way, we
can assess a particular performance, regardless of the game structure variations.

To obtain the true distribution of all possible scores, in theory, we can enumerate all possible
scores, but it is practically impossible for any realistic simulation game. Fortunately, a
random sample of all possible scenarios can be used to obtain a reasonably good estimation
of the score distribution. Selecting a random sample from all the possible scenarios is
essentially generating a random decision in each decision period of the game and running
the game with these decisions. We call this decision rule the random decision heuristic.

It is also possible to utilize other decision heuristics instead of the random decision heuristic.
In this case, the resulting distribution would be the distribution of scores obtainable
by applying this particular decision heuristic. An important step of testing effectiveness
of decision rules is comparing experimental results with the decision rule (Sterman, 1987).
While comparing the behavior of the rule with the behavior of real subjects is a necessary
part of this test (Barlas and Ozevin, 2004; Diehl and Sterman, 1995; Paich and Sterman,
1993), one can further assess the effectiveness of the decision rule by comparing the score
distribution coming from a decision heuristic with experimental results (see Kampmann,
1992, for an example).

In this paper, we describe the growth management game and its different versions involving
complexity factors. We illustrate how we obtain the random score distributions for different
game versions. Next, we compare score distributions for different game versions with
each other and with the players’ performances. Finally, we design a hill-climbing decision
heuristic and compare its performance with the random heuristic and players’ performances.

2 A Growth Management Game

We designed a growth management game for testing the effects of complexity factors on
the overall complexity of the game. The base version of the growth management game does
not have any dynamic complexity element; nonlinearity, delay and feedback (naturally
with a stock) are introduced one by one to the game. The factor levels are changed in the
experiments by changing the delay order, delay duration, shape of nonlinear functions and
gain of feedback loop in the simulator. We selected several levels for each complexity factor:
eight for delay time and feedback strength; four for delay order and nonlinearity. A total of
20 subjects is used. There are three player groups composed of eight players: delay group,
nonlinearity group and feedback group. The players play the simplest base game at the beginning
and at the end. In between, they play eight games involving one of the complexity factors
in different levels. We use a modified version of Latin square design so that the effect of
playing order is minimized.

The details of the experiments and analysis of the results are out of the scope of this paper;
they can be found in Özgün and Barlas (2011). In this paper, we focus on the effects of
complexity factors on the score distribution, and comparison of experimental results with
decision heuristics. Due to space limitations, we exclude the feedback factor; however, the
method described here is applicable to any factor.

2.1 Base Version of the Growth Management Game

The players play the role of a product manager for a certain brand of lotion of a cosmetics
company. The time unit of the model is weeks and the time horizon is 40 weeks. The
calculation step, dt, is taken to be one week. Players make two types of decisions: price,
p, and advertising minutes aired on radio per week, a. Sales, S, are directly proportional to
advertising and inversely proportional to price. The sales amount determines the revenue,
R, and the revenue determines the weekly profit, Π. The players’ aim is to increase their
cumulative profit as much as possible, in a sustainable way.

The game structure is as shown in Figure 1. The equations of the game are in Appendix A.
Since the base game will constitute a reference point, which the results of other games
are compared against, it is kept as simple as possible. The base game does not include
any stock. Thus, essentially it is not a dynamic game, but a simple trial-and-error task of
figuring out the price—advertising combination yielding a high profit.

The game is initially at equilibrium. The players are not allowed to change the initial condi-
tions in the first week. Starting from the second week, players give two decisions: price and
advertising. They can see the behavior of their weekly profit and the benchmark behavior.
The benchmark behavior is defined such that long-run profit is maximized. For finding
the benchmark, the weekly profits corresponding to all price-advertising combinations are
calculated and the combination yielding the maximum weekly profit is applied starting
from the second week. In the base game, it is impossible to exceed the weekly benchmark
profit, hence, the cumulative benchmark profit. Therefore, the benchmark provides a strict
upper-limit.

Figure 1: The base structure of the Growth Management Game. (Diagram labels: Advertising, Price, Normal Advertising, Weekly Profit Π.)
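
The benchmark search described above amounts to an exhaustive grid search over the decision space. The following is a minimal Python sketch under stated assumptions: the profit surface `weekly_profit`, the grids `PRICES` and `AD_MINUTES` and the initial decision are hypothetical stand-ins, not the Appendix A equations. Only the grid size of roughly 33 x 28 comes from the paper, and which variable has 33 levels is itself an assumption.

```python
from itertools import product

PRICES = range(10, 43)       # 33 hypothetical price levels
AD_MINUTES = range(0, 28)    # 28 hypothetical advertising levels


def weekly_profit(price, advertising):
    """Illustrative concave profit surface; NOT the Appendix A equations."""
    return 5000.0 - 8.0 * (price - 30) ** 2 - 5.0 * (advertising - 20) ** 2


def find_benchmark(profit_fn, prices, ads):
    """Return the (price, advertising) pair with the highest weekly profit."""
    return max(product(prices, ads), key=lambda decision: profit_fn(*decision))


def cumulative_benchmark(profit_fn, initial, best, weeks=40):
    """Week 1 keeps the initial decision; the best pair is applied from week 2."""
    return profit_fn(*initial) + (weeks - 1) * profit_fn(*best)


best = find_benchmark(weekly_profit, PRICES, AD_MINUTES)
benchmark_score = cumulative_benchmark(weekly_profit, (20, 10), best)
```

Because the base and nonlinear games have no stock, the weekly optimum found this way is also the cumulative optimum, which is why the benchmark is a strict upper limit in those versions.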

2.2 Nonlinearity Factor

When nonlinearity is added to the base model, effects of price and advertising on sales are
made nonlinear, simultaneously, as shown in Figures 2(a) and 2(b). There are four levels
of nonlinearity: mild (denoted by N1), moderate (N2), high (N3) and extreme (N4). Note
that similar to the base game, the nonlinear games do not include any kind of stock. Thus,
the players give 40 independent decisions when playing these game versions.

Introducing nonlinearity changes the size and location of the region where the player can
obtain high profits (See Figure 3). Similar to the base game, the benchmark behavior for
the nonlinear game is the optimum behavior, which is found by the same method applied
in the base game.

2.3 Delay Factor

Delay is introduced in the form of information delays between decisions and their effects
on sales. Since delay structures include stocks, this version of the game has a dynamic
component, unlike the base game and the nonlinear games. The immediate effect of an
action on weekly profit is positive if price is increased or advertising is decreased. For
example, consider the case of a price increase: since sales will be negatively affected by
this price increase only after a delay, the player will enjoy a temporary weekly profit rise until
sales start to fall. If no other subsequent action is taken, weekly profit will eventually
settle at the level it would have reached had there been no delay. Delay is analyzed in two
components: order of delay and delay duration. Order of delay has four levels while delay
duration has eight levels. There are 4 x 8 = 32 possible combinations of all levels of these
two variables. The game versions involving delay are given in Table 1.

Figure 2: Levels of nonlinear effect functions: (a) f(·), effect of price; (b) g(·), effect of advertising. From light to dark: mild, moderate, high and extreme nonlinearity versions.

Figure 3: The surface plots showing weekly profits resulting from all combinations of price and advertising: (a) base game; (b) high nonlinearity setting (N3).

Table 1: The versions of the game involving delay.
Delay durations: 2, 4, 6, 7, 8, 9, 10 and 11 weeks.

Delay Order | Game versions
First Order | O1T2, O1T4, O1T6, [unreadable], O1T10, O1T11
Third Order | O3T4, O3T6, O3T9, O3T10, O3T11
Fifth Order | O5T6, O5T9, O5T10, O5T11
Discrete    | ODT2, ODT4, ODT6, ODT9, ODT10, ODT11

The long-term equilibrium point of the profit is not affected by the existence of delay. Thus,
the benchmark behavior is found by setting the price and advertising to a combination that
maximizes the profit in the long-term, starting from the second week. In the existence of
delay, weekly profit first decreases and then gradually reaches a higher level. Figure 4 shows
the benchmark behaviors for the base game and games involving nonlinearity and delay.
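
The two delay components, order and duration, can be illustrated with the standard system dynamics formulation of an information delay. The sketch below is a minimal Python illustration under that assumption: an n-th order delay is modeled as n cascaded first-order smoothing stages integrated with Euler's method at dt = 1 week, and the discrete delay as a pure time shift. The function names are hypothetical and the exact delay equations of the game (Appendix A) may differ.

```python
def delayed_series(inputs, order, duration, dt=1.0):
    """Pass `inputs` through `order` cascaded first-order smoothing stages.

    Each stage has time constant duration / order, so the average total
    delay equals `duration`. All stages start in equilibrium at inputs[0].
    """
    tau = duration / order
    if tau < dt:
        # Euler integration overshoots when a stage is faster than the step.
        raise ValueError("duration / order must be at least dt")
    stages = [inputs[0]] * order
    outputs = []
    for x in inputs:
        outputs.append(stages[-1])          # delayed value seen this week
        upstream = x
        updated = []
        for s in stages:
            updated.append(s + dt * (upstream - s) / tau)
            upstream = s                    # next stage chases the OLD value
        stages = updated
    return outputs


def discrete_delay(inputs, duration):
    """Pure time shift: the output repeats the input `duration` weeks later."""
    weeks = int(duration)
    return [inputs[max(0, t - weeks)] for t in range(len(inputs))]


# A decision stepped from 1.0 to 2.0 in week 2, seen through setting O5T6.
step = [1.0] + [2.0] * 39
perceived = delayed_series(step, order=5, duration=6)
```

Under this formulation a higher order keeps the perceived value flat for longer before it starts to adjust, which is one way to read why long, high-order delays make the outcome of a decision hard to observe.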

Figure 4: Benchmark behaviors for the base, nonlinear and delay game versions: (a) base game; (b) nonlinear game (setting N3); (c) delay game (setting O58). Horizontal axis: Time (weeks).

The benchmark behaviors of game versions involving delay maximize the long-term sus-
tainable profit. The cumulative profit of a player within the limited time horizon of a delay
game can be potentially higher than cumulative benchmark profit. Therefore, unlike the
base game and the nonlinear games, the benchmark behavior of a delay game is not the
optimum behavior, although it is found by applying the same strategy. Due to the dynamic
component of the game involving delay, it is not possible to calculate the optimum behavior
that yields maximum possible cumulative profit.

3 Obtaining Random Heuristic Score Distributions

To obtain a distribution of the scores for the random decision heuristic, we generate a
sequence of random price and advertising decisions for 40 weeks. Then, we simulate the game
with these random decisions and record the resulting cumulative profit. This cumulative
profit represents the performance score of a simulated player and constitutes one data
point. We repeat these steps many times and obtain a vast number of scores. The histogram
of these scores shows the distribution of possible scores.
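
The sampling procedure above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the simulator used in the study: `weekly_profit`, the decision grids and the initial decision `INITIAL` are hypothetical stand-ins for the Appendix A equations, and the game is assumed to be stock-free (as in the base and nonlinear versions).

```python
import random
import statistics

PRICES = list(range(10, 43))      # 33 hypothetical price levels
AD_MINUTES = list(range(0, 28))   # 28 hypothetical advertising levels
INITIAL = (20, 10)                # hypothetical equilibrium decision, week 1


def weekly_profit(price, advertising):
    """Illustrative stock-free profit surface; NOT the Appendix A equations."""
    return 5000.0 - 8.0 * (price - 30) ** 2 - 5.0 * (advertising - 20) ** 2


def simulate_random_player(rng, weeks=40):
    """Cumulative profit of one simulated player who decides at random."""
    total = weekly_profit(*INITIAL)   # week 1: initial conditions are kept
    for _ in range(weeks - 1):
        total += weekly_profit(rng.choice(PRICES), rng.choice(AD_MINUTES))
    return total


def random_score_distribution(n_players, seed=0):
    """One cumulative profit (score) per simulated player."""
    rng = random.Random(seed)
    return [simulate_random_player(rng) for _ in range(n_players)]


scores = random_score_distribution(2000)
mean_score = statistics.fmean(scores)
sd_score = statistics.stdev(scores)
```

A histogram of `scores` then plays the role of the random score distribution; the sample size of 2000 is chosen only to keep the sketch fast.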

In the growth management game, there are 40 periods and roughly 33 x 28 = 924 possible
decisions at each period. This makes 924^40 (about 4 x 10^118) possible decision sequences, each
yielding a cumulative profit. From this pool of all possible sequences of decisions,
we took a sample size of 10°. Generating the random decisions and running the model for
10° simulated players took about two hours on a powerful computer.

Remember that in the base game and the nonlinear games, the results of consecutive
decisions are independent (because these games do not have a stock). Since cumulative
profit is the sum of 40 independent weekly profits, the statistical distribution of cumulative
profits should be normal (Gaussian) by the Central Limit Theorem.
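
The Central Limit Theorem argument can be written out as follows. This is a sketch assuming that the weekly profits are independent and identically distributed; the symbols \pi_t (weekly profit in week t), \mu_w and \sigma_w (mean and standard deviation of one weekly profit under the random heuristic) are introduced here for illustration and do not appear in the original equations.

```latex
% Assumption: weekly profits \pi_t are i.i.d. with mean \mu_w and s.d. \sigma_w.
\[
\Pi_{\mathrm{cum}} = \sum_{t=1}^{40} \pi_t, \qquad
E[\Pi_{\mathrm{cum}}] = 40\,\mu_w, \qquad
\mathrm{SD}[\Pi_{\mathrm{cum}}] = \sqrt{40}\,\sigma_w,
\]
\[
\Pi_{\mathrm{cum}} \;\approx\; \mathcal{N}\!\left(40\,\mu_w,\; 40\,\sigma_w^{2}\right).
\]
```

Strictly speaking, the first week is fixed at the initial conditions, so the sum has 39 random terms plus a constant; the same argument applies with 39 in place of 40 for the random part.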

Figure 5 shows the resulting distribution of cumulative profits for the base game, obtained
by running the random heuristic 10° times. As expected, the distribution fits the normal
distribution with a mean of 40,264 and a standard deviation of 5020. The theoretical values
of the mean and the standard deviation calculated using the Central Limit Theorem are
40,256 and 5020, respectively. Figure 5 also shows the locations of two fixed strategies:
the do-nothing strategy and the benchmark. The cumulative profit of the do-nothing strategy is
40,000. The proximity of the do-nothing score and the mean of the random-heuristic scores is purely
coincidental. The benchmark’s cumulative profit is 113,634, which is calculated as the sum
of weekly profits shown in Figure 4(a). Note that the benchmark score (which is the
maximum possible score in the base game) is far beyond the simulated scores of the random
heuristic. Indeed, the benchmark is 14.6σ away from the mean of the distribution!

Figure 5: Probability density of random scores (cumulative profits) for the base game,
estimated by histogram and kernel estimator. (Markers: do-nothing score and benchmark score; horizontal axis from 0 to 120,000.)

Figure 6: Probability density of random scores (cumulative profits) for the game involving
high nonlinearity (setting N3). (Markers: do-nothing score and benchmark score; horizontal axis from 0 to 200,000.)

We repeat the same procedure for the nonlinear games. Note that the introduction of
nonlinearity increases the maximum possible weekly profit obtainable (see Figure 4(b)).
This effect is expected to increase the standard deviation of the cumulative profits. On top
of that, introducing nonlinearity also changes the shape of the weekly profit surface (see
Figure 3(b)). This effect shifts the distribution of the cumulative profits. Figure 6 shows the
resulting distribution of the random heuristic scores for the game involving nonlinearity.
(Note the scale difference between Figure 5 and Figure 6.) The standard deviation of
nonlinear games’ scores is 2.25 times that of the base game: it rises from 5020 to
11,319. The mean is shifted to 77,122. Observe that the do-nothing score is now far
below the median of the score distribution. Moreover, the benchmark score (still the maximum
possible score) is now 11.6σ away from the mean (as compared to 14.6σ).

Figure 7 shows the cumulative profit distribution for a game version involving delay (Setting
O5T6: fifth order 6-wk delay). Although the Central Limit Theorem would not apply here,
the Kolmogorov-Smirnov test for normality yields a very low p-value. The most striking change
brought by the introduction of delay is the relative lowering of the benchmark cumulative
profit. Although the benchmark is calculated by applying the same strategy in all games,
unlike the base and nonlinear games, the benchmark in the delay case is not the maximum
possible score. Nonetheless, it is 5.7σ away from the mean, meaning that it still represents
a fairly good strategy.

Figure 7: Probability density of random scores (cumulative profits) for the game involving
6-wk fifth order delay (setting O5T6). (Markers: do-nothing score and benchmark score; horizontal axis from 0 to 120,000.)

4 Comparing Random Score Distributions with Experimental Results

First, we compare the random scores for the base game with the players’ scores (Figure 8).
We have 24 data points for the base game. The players play the base game at the beginning
and at the end of the experiment session. In between, they play nine games involving
complexity factors. Overall, the players’ scores are much better than the random scores.
Most experiment scores are even higher than the best random score obtained in the simulations.
This is not very surprising because the random heuristic represents a completely random
decision process. The intelligent decision-making processes of the players are expected to
beat an unintelligent random heuristic.

Figure 9 compares random scores with players’ cumulative profits for a game version
involving high nonlinearity. We have eight experimental data points for each nonlinear game version.
All 32 of the experimental scores are higher than the maximum of the random heuristic scores
(see Figure 19 in Appendix B). The densities of nonlinear versions are similar to the base
game’s density. The results indicate that nonlinearity does not bring any significant complexity
to the game.

Figure 8: Distribution of the random scores for the base game (grey area) versus players’
scores (individual dots) and their density (dashed line). Black dots represent scores from
the first trial, the grey dots represent scores from the last trial.

Figure 10 compares random scores with experiment scores for a game involving delay.
The figures for other delay games are in Appendix B. Introduction of delay brings a
considerable difference. The players’ scores are much lower compared to no-delay versions.
The distinction between players’ scores and simulated scores seems to vanish. This is an
important observation because it indicates that delay makes it very difficult to carry out
a sensible decision making process, giving rise to scores close to the results of a totally
unintelligent decision making rule. In the growth management game, delay not only makes
the game difficult to control but also makes it difficult to discover a good strategy. Unlike
a game in which players control a single variable, in the growth management game players
have to find out which strategy would yield a good profit in a two-dimensional price-advertising
space. This trial-and-error process takes a long time when delay exists and
leads to bad performance scores.

Since now we have the distribution for all game versions, we can compare the scores coming
from different game versions with each other. To do this, we define two performance
measures.

Figure 9: Distribution of the random scores for a nonlinear game, setting N3 (grey area),
versus players’ scores (individual dots) and their density (dashed line).

Figure 10: Distribution of the random scores for a delay game, setting O5T6 (grey area),
versus players’ scores (individual dots) and their density (dashed line).

The first performance measure is standardized cumulative profit. It standardizes the player
scores with respect to the means and variances of the corresponding game version’s random score
distributions, as follows:

\[
\text{Standardized cumulative profit} = \frac{\text{Cumulative profit} - \text{Random heuristic's mean cumulative profit}}{\text{Standard deviation of random heuristic's cumulative profits}} \tag{1}
\]

This performance measure makes use of the fact that the random score distributions follow
a normal distribution. It shows how many standard deviations a given score lies from
the mean of random scores.

A second performance measure is called probabilistic score and is defined as:

\[
P\{X < x\} \tag{2}
\]

where x is the score to be evaluated and X is a random variable drawn from the density
of random scores. This performance measure ranges between 0 and 1 and shows the
probability of a random score being less than the score x. Note that it does not make any
assumption about the distribution of the random scores. In general, we have two options
to calculate this probability: either (1) empirically by calculating the frequency of random
scores lower than x, or (2) by fitting a known probability distribution to the simulated data
and calculating the probability using this distribution. In the growth management game,
many experiment scores of the base and nonlinear games are higher than scores coming
from the random heuristic. Therefore, the first option is not feasible for these game versions.
Fortunately, we know that for the base and nonlinear games, the theoretical distribution
of cumulative profits is normal. Thus, we can estimate the mean and standard deviation
from the sample and use the second option. For the delay version, since real data points
fall in the range of the random heuristic score distribution, we can use the first option.
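
Both measures, and the two options for computing the probabilistic score, can be sketched with the Python standard library. The function names are hypothetical, and the sample `random_scores` is a synthetic stand-in drawn from a normal distribution with the base game's reported mean (40,264) and standard deviation (5020), not the simulated scores of the study.

```python
from statistics import NormalDist, fmean, stdev


def standardized_profit(score, random_scores):
    """Measure (1): distance from the random mean in standard deviations."""
    return (score - fmean(random_scores)) / stdev(random_scores)


def probabilistic_score_empirical(score, random_scores):
    """Option 1: share of simulated random scores strictly below `score`."""
    return sum(s < score for s in random_scores) / len(random_scores)


def probabilistic_score_normal(score, random_scores):
    """Option 2: P{X < x} under a normal distribution fitted to the sample."""
    fitted = NormalDist(fmean(random_scores), stdev(random_scores))
    return fitted.cdf(score)


# Synthetic stand-in for the base game's random score sample (assumption).
random_scores = NormalDist(40_264, 5_020).samples(10_000, seed=1)

benchmark = 113_634
z_benchmark = standardized_profit(benchmark, random_scores)
p_empirical = probabilistic_score_empirical(benchmark, random_scores)
p_fitted = probabilistic_score_normal(benchmark, random_scores)
```

For any score beyond the largest simulated score, the empirical option saturates at exactly 1.0 and cannot distinguish between such scores, which is the practical reason the fitted distribution is needed for the base and nonlinear games.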

Figure 11 compares scores coming from experiments of the base game and a delay game using
four performance measures. The first performance measure is the usual cumulative profit
(Figure 11(a)). Since the scales of the two games are different, it is not very clear whether
a particular data point coming from one version is better than another performance in
the other version. The second performance measure is normalized cumulative profit. It is
defined as:

\[
\text{Normalized cumulative profit} = \frac{\text{Cumulative profit} - \text{Cumulative profit of do-nothing strategy}}{\text{Cumulative benchmark profit} - \text{Cumulative profit of do-nothing strategy}} \tag{3}
\]

This performance measure attempts to solve the scale problem by normalizing the scores
based on two fixed strategies: do-nothing strategy and benchmark strategy (Figure 11(b)).
This is a simple normalization yielding conclusions consistent with the nature of the problem,
especially when comparing game versions with similar structures. However, it ignores
the possibility that these two reference points may not be at the same difficulty level for
different game versions. Standardized cumulative profit takes into account the difficulty
of obtaining a score by measuring its distance to the mean of random scores in terms of
standard deviations. As seen from Figure 11(c), standardized cumulative profit reveals the
fact that with respect to the random scores, the players performed much better in the base
game compared to the delay game. Note that, although standardized cumulative profit
considers the relative position of player scores within the random score distribution, it ignores
the fact that the probabilities drop as we move away from the mean. Probabilistic score
calculates the probability of obtaining a score by applying the random heuristic. Based
on this measure, we see that (Figure 11(d)) most of the scores obtained by players in the
base game are almost impossible to obtain by the random heuristic (hence densely located
around 1.0) although they seem to be scattered in a wide range in terms of the first two
performance measures. This measure indicates a stronger difference between the scores of delay
games and base games than the other measures do.
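
As a worked illustration of how the normalized and standardized measures differ, the sketch below applies both to the base game using the values reported in Section 3 (do-nothing score 40,000, benchmark 113,634, random-score mean 40,264 and standard deviation 5020). The player score of 90,000 and the function names are hypothetical.

```python
def normalized_profit(score, do_nothing, benchmark):
    """Measure (3): position between the do-nothing and benchmark scores."""
    return (score - do_nothing) / (benchmark - do_nothing)


def standardized_profit(score, random_mean, random_sd):
    """Measure (1): distance from the random mean in standard deviations."""
    return (score - random_mean) / random_sd


# Base game reference values as reported in Section 3.
BASE = {"do_nothing": 40_000, "benchmark": 113_634, "mean": 40_264, "sd": 5_020}

player_score = 90_000  # hypothetical player, for illustration only
norm = normalized_profit(player_score, BASE["do_nothing"], BASE["benchmark"])
std = standardized_profit(player_score, BASE["mean"], BASE["sd"])
```

The normalized measure places this hypothetical score at about two thirds of the way from do-nothing to the benchmark, while the standardized measure places it almost ten standard deviations above the random mean; the two scales answer different questions.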

Despite its theoretical appeal, the probabilistic score yields results that are difficult to
interpret for the growth management game. Data points are mostly at either extreme,
which reduces the reliability of the probabilistic score as a performance measure for the
growth management game.

Figure 11: Distribution of scores for the base game and for a delay game (O5T6) using four
performance measures: (a) cumulative profit; (b) normalized cumulative profit; (c) standardized cumulative profit; (d) probabilistic score. The upper and lower lines show the benchmark and the do-nothing
scores, respectively.

5 Hill-Climbing Heuristics

We saw that the player scores are usually higher than the random scores. In this section,
we propose a decision heuristic that is expected to better represent the decision process of
players. We then simulate the heuristic and compare the resulting score distribution with
experimental results to assess its performance.

Since this is a growth management task, a hill-climbing heuristic is appropriate. It is
widely used as an optimization heuristic (Sterman, 2000). It is based on the idea that if
you continue to move in the direction with steepest ascent, you will reach the maximum
point, as long as you are not caught in a local optimum. Our algorithm extends the idea of
hill-climbing by implementing it on a two-dimensional price—advertising space. Moreover,
it only uses the information that is available to a player.

5.1 Hill-Climbing for Base and Nonlinear Game Versions

The hill-climbing algorithm proceeds as follows. First, it takes two initial steps: one for
price, one for advertising, in random order, with arbitrary directions and step sizes. Each
of these first two steps involves movement in only one of the two variables. In this way, the
algorithm determines whether a direction is beneficial and how much profit change it brings.
Since there is no delay in these games, the algorithm can immediately assess the outcome
of the decisions. Based on the first two steps, the algorithm determines the most promising
direction by taking a weighted linear combination of the price and advertising directions with
positive profit gains (these are called the two base directions, b1 and b2). The weights are based
on the profit gains per unit change in the variables. Then, it advances in that direction
(d*) with a random step size. To add further randomness, the movement is not precisely
in the best direction but within a one-unit neighborhood of the direction. After moving, the
algorithm checks whether this direction (d*) is still beneficial and records the change in the
profit. Depending on the outcome, either d* or -d* becomes one of the two base directions (b2,
to be precise), and the newer of the former base directions becomes b1. At each step, the best
direction is updated in this way, making sure that we always have two linearly independent
vectors as the base directions. A detailed pseudocode can be found in Algorithm 1 in
Appendix C.
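
The steps above can be sketched in Python as follows. This is a simplified illustration, not Algorithm 1 of Appendix C: the profit surface `weekly_profit`, the starting decision, the step-size range and the angular jitter of 0.2 radians (standing in for the one-unit neighborhood) are all assumptions, and decisions are not clamped to a feasible range.

```python
import math
import random


def weekly_profit(price, advertising):
    """Illustrative concave surface; NOT the Appendix A equations."""
    return 5000.0 - 8.0 * (price - 30.0) ** 2 - 5.0 * (advertising - 20.0) ** 2


def hill_climb(profit_fn, start, weeks=40, max_step=2.0, seed=0):
    """Simplified two-base-direction hill climb; returns the weekly profits."""
    rng = random.Random(seed)
    pos = tuple(start)
    profit = profit_fn(*pos)
    profits = [profit]                  # week 1: the initial decision is kept
    bases = []                          # [(unit direction, gain per unit step)]

    def move(direction, step):
        """Take one step, record the profit, return the oriented base."""
        nonlocal pos, profit
        pos = (pos[0] + step * direction[0], pos[1] + step * direction[1])
        new_profit = profit_fn(*pos)
        gain = (new_profit - profit) / step
        profit = new_profit
        profits.append(profit)
        if gain >= 0:
            return direction, gain
        return (-direction[0], -direction[1]), -gain   # flip a losing direction

    # Two probing steps: one per variable, random order, direction and size.
    for axis in rng.sample([0, 1], 2):
        sign = rng.choice([-1.0, 1.0])
        direction = (sign, 0.0) if axis == 0 else (0.0, sign)
        bases.append(move(direction, rng.uniform(0.5, max_step)))

    while len(profits) < weeks:
        (b1, g1), (b2, g2) = bases
        # Weighted combination of the base directions, weights = profit gains.
        combo = (g1 * b1[0] + g2 * b2[0], g1 * b1[1] + g2 * b2[1])
        if math.hypot(*combo) < 1e-9:   # flat estimate: pick a random direction
            angle = rng.uniform(0.0, 2.0 * math.pi)
        else:
            angle = math.atan2(combo[1], combo[0])
        angle += rng.uniform(-0.2, 0.2)  # move near, not exactly along, d*
        d = (math.cos(angle), math.sin(angle))
        newest = move(d, rng.uniform(0.25, 1.0) * max_step)
        # Keep two linearly independent bases; the newer old base becomes b1.
        independent = abs(d[0] * b2[1] - d[1] * b2[0]) > 1e-6
        bases = [bases[1] if independent else bases[0], newest]
    return profits


profits = hill_climb(weekly_profit, start=(20.0, 10.0))
cumulative_profit = sum(profits)
```

Repeating `hill_climb` over many seeds and collecting `sum(profits)` would give a score distribution for this heuristic in the same way as for the random heuristic. The sketch uses a purely random step size and omits the reduction of step size on milder slopes.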

Figure 12 shows a typical output of the hill-climbing algorithm for the base game. In
this case, the algorithm first determines a good direction by taking one advertising step
downwards and one price step to the right. Then, it quickly approaches the optimum point.
The step size is decreased as the slope gets milder. Since the algorithm does not know the
location of the benchmark, and the fact that benchmark is optimum, it continues searching
for a better profit. Such a pattern is consistent with players’ behavior.

Next, we carry out simulations with the hill-climbing algorithm. Figure 13 shows the
distribution of 10,000 scores generated by the hill-climbing heuristic for the base game.
Based on the cumulative profits, it is apparent that the hill-climbing heuristic is more
consistent with experiment results than the random heuristic (see Figure 8).

Figure 12: A typical output of the hill-climbing algorithm for the base game: (a) contour plot (axes: price, advertising); (b) time-dependent behavior (horizontal axis: Time (weeks)).

Figure 13: Distribution of the hill-climbing heuristic scores for the base game (grey area)
versus players’ cumulative profits (individual dots) and their density (dashed line).

Figure 14 shows a typical output for a nonlinear version. In the nonlinear versions, the
optimum point is different and the profit surface is steeper, but the algorithm works in
the same fashion. The performance of the algorithm is also similar to its performance in
the base game. As seen in Figure 15, the score distribution lies close to the benchmark
and contains the experimental results. These results indicate that the hill-climbing
heuristic is consistent with players’ performances for the base and nonlinear versions.

Figure 14: A typical output of the hill-climbing algorithm for a nonlinear game version
(N3). Panels: (a) Contour plot (price vs. advertising); (b) Time-dependent behavior (time
in weeks).

Figure 15: Distribution of the hill-climbing heuristic scores for a nonlinear game, setting
N3 (grey area), versus players’ cumulative profits (individual dots) and their density
(dashed line). The do-nothing score and the benchmark score are marked on the plot.

5.2 Hill-Climbing for the Delay Game Version

The existence of delay considerably complicates the growth management game task. First,
the results of the actions are delayed. If we want to apply the same principles that we
used in the hill-climbing algorithm described above, we have to wait until each action shows
its full effect, which is impractical when the delays are long and of high order. Second, in
the growth management game, each action has two consequences: an immediate effect and a
delayed effect. If the player makes decisions without waiting to see their full effects, the
immediate and delayed effects mix. It becomes impossible to tell what portion of a profit
change is due to the most recent decision and what portion is due to an earlier decision
acting through the delay.

Nevertheless, we adapted the hill-climbing algorithm to accommodate delay. We distinguish
between the discrete and continuous delay cases. In this paper, we present only the simpler
version: the discrete-delay case. The algorithm for the delay case makes use of two sets of
base directions: one based on immediate information and one based on delayed information.
The base directions based on immediate information are updated like the base directions in
the no-delay case. In doing this, we ignore the fact that a profit change is also affected
by an earlier decision through the delayed effect. In calculating the base directions based
on delayed information, the algorithm assumes that the decisions made T weeks earlier are
realized in this week. However, in doing this, we ignore the fact that last week’s decision
also affects this week’s profit. To compensate for these problems, we use a weighted average
of delayed and immediate information to determine the direction of movement. Algorithm 2
in Appendix C presents a pseudocode of the heuristic.
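
As an illustration of the weighting step, the following Python sketch combines a direction based on delayed information with one based on immediate information and rescales the result to unit length, following the first two lines of Step 3 in Algorithm 2. The function name and the handling of a zero-length result are assumptions.

```python
import math

def combine_directions(d_del, d_imm, w_del):
    """Weighted average of delayed and immediate directions, as a unit vector.

    d_del, d_imm: (price, advertising) direction tuples
    w_del: weight of the delayed information, between 0 and 1
    """
    dx = w_del * d_del[0] + (1.0 - w_del) * d_imm[0]
    dy = w_del * d_del[1] + (1.0 - w_del) * d_imm[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        return (0.0, 0.0)  # assumption: stay put when the two directions cancel
    return (dx / norm, dy / norm)

# Equal weights, as in the setting shown in Figure 16
d = combine_directions((1.0, 0.0), (0.0, 1.0), 0.5)
```

With equal weights and orthogonal inputs, the resulting direction lies halfway between the two.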

Figure 16 shows a typical behavior for the discrete-delay hill-climbing algorithm when the
weights of delayed and immediate information are equal. Note that the contour plot in
the delay case does not show the instantaneous level of profit, but the equilibrium level that
will eventually be reached if no further action is taken. In the delay case, the algorithm
is not able to make clear progress.

Figure 16: A typical output of the hill-climbing algorithm for the discrete delay game
version with delay time of 4 weeks and weight of delayed information 0.5. Panels:
(a) Contour plot; (b) Time-dependent behavior (time in weeks).

Figure 17 shows the distribution of hill-climbing simulation scores versus random heuristic
scores. It is clear from the figure that the hill-climbing heuristic is not superior to the simple
do-nothing strategy. This conclusion is further verified by experimenting with different
delay orders, delay times and weights.

Figure 17: Distributions of hill-climbing scores versus random scores for the discrete delay
version (ODT4). The do-nothing score and the benchmark score are marked on the plot.

6 Conclusion

In this paper, we use statistical distributions of scores generated by decision heuristics
to evaluate the results of gaming experiments with a growth management game involving
different complexity factors. We develop a random “decision rule” in which the heuristic
makes random decisions in every game step and the score is calculated by feeding these
random decisions to the game. We compare the resulting score distribution with real
players’ scores. We find that in the experiments with the base game (which does not
have any complexity element) and the nonlinear game (which only includes nonlinearity
between an action and its effect), real subjects outperform the random decision heuristic.
On the other hand, for the game versions involving delay, player results and random
heuristic results are close. Indeed, a considerable number of players perform worse than the
mean of the random heuristic score distribution. There are even players who perform worse
than the do-nothing strategy. These results show that for the growth management game, delay
creates an environment that renders rational decision-making ineffective.

We next propose a hill-climbing decision heuristic for the growth management game. The
heuristic addresses the problem of finding the steepest ascent in a two-dimensional feasible
space with limited information. It uses only the information that is available to the player
and exhibits quick progress. The algorithm is stochastic because of the randomness in step
sizes and deliberate minor deviations from the best direction of movement. By running
the algorithm numerous times, we obtain a score distribution. The results show that the
hill-climbing heuristic is a good representation of players’ decision strategies for the base
and nonlinear game versions.

The hill-climbing heuristic algorithm is modified to account for delays. However, the nature
of delay and some special properties of the growth management game make it very difficult
to develop an effective heuristic. The results of the hill-climbing algorithm in this case
are not superior to the results of the random heuristic. This outcome is consistent with
our previous observations about the delay game: not only do subjects’ mental heuristics fall
behind the random heuristic, but an “intelligent” heuristic cannot perform well either.
Despite its poor performance, the hill-climbing heuristic for the delay version can
constitute a first step toward a more sophisticated decision rule.

Throughout the paper, we illustrate how score distributions can be used for different
purposes. First, we use the random score distribution as a point of reference to compare
different versions of a game, which are difficult to compare due to differences in the game
structures. Second, we use score distributions resulting from the hill-climbing heuristic
simulations to compare its performance for different versions and under different settings.
In addition, it is possible to use the simulation outputs to represent experimental results
if the heuristic used is a good-enough representation of the players’ decision heuristics.
However, such a substitution requires an elaborate experimental study of its own.

References

Bakken, B. E., 1993, Learning and Transfer of Understanding in Dynamic Decision Envi-
ronments, Ph.D. Dissertation, Massachusetts Institute of Technology.

Barlas, Y. and M. G. Ozevin, 2004, “Analysis of Stock Management Gaming Experi-
ments and Alternative Ordering Formulations,” Systems Research and Behavioral Sci-
ence, Vol. 21, pp. 439-470.

Diehl, E. and J. D. Sterman, 1995, “Effects of Feedback Complexity in Dynamic Decision
Making,” Organizational Behavior and Human Decision Processes, Vol. 62, No. 2, pp.
198-215.

Größler, A., F. H. Maier and P. M. Milling, 2000, “Enhancing Learning Capabilities by
Providing Transparency in Business Simulators,” Simulation & Gaming, Vol. 31, No. 2,
pp. 257-278.

Kampmann, C. E., 1992, Feedback Complexity and Market Adjustment: An Experimental
Approach, Ph.D. Dissertation, Massachusetts Institute of Technology.

Moxnes, E. and A. K. Saysel, 2009, “Misperceptions of global climate change: information
policies,” Climatic Change, Vol. 93, pp. 15-37, 10.1007/s10584-008-9465-2.

Özgün, O. and Y. Barlas, 2011, “Analysis of the Effects of Different Complexity Factors
on the Complexity of a Simulation Game,” 29th International Conference of the System
Dynamics Society, Seoul, Republic of Korea.

Paich, M. and J. D. Sterman, 1993, “Boom, Bust, and Failures to Learn in Experimental
Markets,” Management Science, Vol. 39, No. 12, pp. 1439-1458.

Rouwette, E. A. J. A., A. Größler and J. A. M. Vennix, 2004, “Exploring Influencing
Factors on Rationality: A Literature Review of Dynamic Decision-Making Studies in
System Dynamics,” Systems Research and Behavioral Science, Vol. 21, pp. 351-370.

Sterman, J. D., 1987, “Testing Behavioral Simulation Models by Direct Experiment,” Man-
agement Science, Vol. 33, No. 12, pp. 1572-1592.

Sterman, J. D., 2000, Business dynamics, McGraw-Hill New York.

Trees, W. S., J. K. Doyle and M. J. Radzicki, 1996, “Using Cognitive Styles Typology
to Explain Differences in Dynamic Decision Making in a Computer Simulation Game
Environment,” Proceedings of the 14th International Conference of the System Dynamics
Society, Cambridge, MA, U.S.A.

Yang, J., 1996, “Facilitating Learning through Goal Setting in A Learning Laboratory,”
Proceedings of the 1996 International System Dynamics Conference, Cambridge, MA,
U.S.A., The System Dynamics Society.

Appendices

A Equations of the Growth Management Game

The sales amount for week t is:

S_t = S_0 f(p_t) g(a_t)    (4)

where f(·) and g(·) are effect functions of price and advertising, respectively. Their
functional forms are shown in Figure 18.

Figure 18: Linear functions showing the effects of price and advertising on sales. Panels:
(a) f(·): Effect of price; (b) g(·): Effect of advertising.

Weekly revenue and profit are computed as:

R_t = p_t S_t    (5)
Π_t = R_t − b S_t − c a_t    (6)

where c is the cost of advertising per minute per week and b is the production cost per item
sold.
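
Equations (4) to (6) can be evaluated with a short Python sketch. The constants (base sales of 100, b = 5, c = 50) and the normalizations p/20 and a/10 are taken from the PROFIT function of Algorithm 1 in Appendix C. The two effect functions below are hypothetical linear placeholders, since the actual forms are given only graphically in Figure 18.

```python
def effect_price(pn):
    """Hypothetical linear effect of normalized price (equals 1 at pn = 1)."""
    return max(0.0, 2.0 - pn)

def effect_advts(an):
    """Hypothetical linear effect of normalized advertising (equals 1 at an = 1)."""
    return max(0.0, an)

def weekly_profit(p, a, f=effect_price, g=effect_advts, s0=100.0, b=5.0, c=50.0):
    """Weekly profit: revenue minus production cost minus advertising cost."""
    sales = s0 * f(p / 20.0) * g(a / 10.0)  # Equation (4)
    revenue = p * sales                      # Equation (5)
    return revenue - b * sales - c * a       # Equation (6)

# Initial conditions used in the pseudocode: price 20, advertising 10
profit_at_start = weekly_profit(20, 10)
```

With these placeholder effect functions, both effects equal 1 at the initial conditions, so sales are 100 units and the weekly profit is 2000 − 500 − 500.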

B Random Score Distributions for Different Game Versions

Figure 19: Densities of random scores for nonlinear game versions (panels N1, N2, N3 and N4).

Figure 20: Densities of random scores for game versions involving first order delay (panels
O1T2, O1T4, O1T6, O1T7, O1T8, O1T9, O1T10 and O1T11).

Figure 21: Densities of random scores for game versions involving third order delay (one
panel per game version, O3T4 to O3T11).

Figure 22: Densities of random scores for game versions involving fifth order delay (one
panel per game version, O5T6 to O5T11).

Figure 23: Densities of random scores for game versions involving discrete delay (panels
ODT2, ODT4, ODT6, ODT7, ODT8, ODT9, ODT10 and ODT11).

C Hill-Climbing Algorithm Pseudocodes

Algorithm 1 Hill-climbing heuristic for the base and nonlinear versions

Require: EFFECTPRICE(p_n)  ▷ function defining effect of normalized price, p_n, on sales
Require: EFFECTADVTS(a_n)  ▷ function defining effect of normalized advertising, a_n, on sales
Require: UNIFORM(a, b)  ▷ a function generating Uniform random variates between a and b
Require: p_min, p_max, a_min, a_max  ▷ allowed limits for price and advertising in the game

function PROFIT(p, a)
    return 100 · (EFFECTADVTS(a/10) · EFFECTPRICE(p/20)) · p − 50 · a − 5 · 100 · (EFFECTADVTS(a/10) · EFFECTPRICE(p/20))
end function

function PICK_DIRECTION()  ▷ picks a random direction 1 or −1
    if UNIFORM(0, 1) < 0.5 then return −1
    else return 1
    end if
end function

function PICK_STEPSIZE_CONT(minss, maxss)  ▷ picks a random real stepsize between minss & maxss
    return UNIFORM(minss, maxss)
end function

function PICK_STEPSIZE_DISC(minss, maxss)  ▷ picks a random integer stepsize between minss & maxss
    return ⌊UNIFORM(minss − 0.5, maxss + 0.5)⌉
end function

function FIND_COMBINED_DIRECTION(p_1, a_1, g_1, p_2, a_2, g_2)  ▷ determines best direction by taking linear combination of two unit vectors and their gains per unit change
    if det [p_1 p_2; a_1 a_2] ≠ 0 then
        w_1 ← … / (g_1 + g_2),  w_2 ← … / (g_1 + g_2)
        return ( (p_1, a_1)^T · w_1 + (p_2, a_2)^T · w_2 ) / …
    else
        return …
    end if
end function

function EFFECTOFSLOPEONSTEPSIZE(slope)
    …
end function

Initialize
    b_1 ← (0, 0)^T, b_2 ← (0, 0)^T  ▷ two base directions
    ug_1 ← 0, ug_2 ← 0  ▷ base unit gains for each direction
    d* ← (0, 0)^T  ▷ best direction
    t ← 0
    a_t ← 10, p_t ← 20  ▷ initial conditions of the game
    Π_t ← PROFIT(p_t, a_t)

Step 1  ▷ Advance two weeks by taking one price and one advertising decision
    t ← t + 1  ▷ t = 1
    if UNIFORM(0, 1) < 0.5 then  ▷ randomly pick price or advertising as the starting variable to change
        picked ← p
    else
        picked ← a
    end if
    p_d ← PICK_DIRECTION()  ▷ pick a random direction for price
    p_ss ← PICK_STEPSIZE_DISC(2, 5)  ▷ pick a stepsize between 2 and 5
    a_d ← PICK_DIRECTION()  ▷ pick a random direction for advertising
    a_ss ← PICK_STEPSIZE_DISC(2, 5)  ▷ pick a stepsize between 2 and 5
    if picked = p then
        p_t ← max(p_min, min(p_max, p_{t−1} + p_d · p_ss))  ▷ In the first step advance p
        a_t ← a_{t−1}  ▷ Keep a constant
        p_{t+1} ← p_t  ▷ In the second step keep p constant
        a_{t+1} ← max(a_min, min(a_max, a_t + a_d · a_ss))  ▷ Advance a
    else  ▷ If a is picked, do the opposite
        a_t ← max(a_min, min(a_max, a_{t−1} + a_d · a_ss))
        p_t ← p_{t−1}
        a_{t+1} ← a_t
        p_{t+1} ← max(p_min, min(p_max, p_t + p_d · p_ss))
    end if
    Π_t ← PROFIT(p_t, a_t)
    Π_{t+1} ← PROFIT(p_{t+1}, a_{t+1})
    b_1 ← ( (p_t − p_{t−1}) · sgn(Π_t − Π_{t−1} + ε), (a_t − a_{t−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_1 ← b_1 / ‖b_1‖
    b_2 ← ( (p_{t+1} − p_t) · sgn(Π_{t+1} − Π_t + ε), (a_{t+1} − a_t) · sgn(Π_{t+1} − Π_t + ε) )^T,  b_2 ← b_2 / ‖b_2‖
    d* ← FIND_COMBINED_DIRECTION(b_{1,1}, b_{1,2}, ug_1, b_{2,1}, b_{2,2}, ug_2)  ▷ find the best direction

Step 2
    t ← t + 2  ▷ t = 3
    while t < 40 do  ▷ Repeat for all remaining weeks
        ss ← PICK_STEPSIZE_CONT(1, 4) · EFFECTOFSLOPEONSTEPSIZE(ug_2)  ▷ Pick a step size
        p_t ← min(p_max, max(p_min, ⌊p_{t−1} + d*_1 · ss⌉ + [UNIFORM(−1, 1)]))  ▷ Calculate next p
        a_t ← min(a_max, max(a_min, ⌊a_{t−1} + d*_2 · ss⌉ + [UNIFORM(−1, 1)]))  ▷ Calculate next a
        Π_t ← PROFIT(p_t, a_t)
        if p_t ≠ p_{t−1} ∨ a_t ≠ a_{t−1} then  ▷ Update base directions if moved
            b_1 ← b_2
            ug_1 ← ug_2
            b_2 ← ( (p_t − p_{t−1}) · sgn(Π_t − Π_{t−1} + ε), (a_t − a_{t−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_2 ← b_2 / ‖b_2‖
            ug_2 ← (Π_t − Π_{t−1}) / ‖b_2‖
        end if
        d* ← FIND_COMBINED_DIRECTION(b_{1,1}, b_{1,2}, ug_1, b_{2,1}, b_{2,2}, ug_2)
        t ← t + 1
    end while
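
The randomized rounding used in PICK_STEPSIZE_DISC and in the position update of Step 2 can be sketched in Python as below. The helper names and the limits 10 and 40 are hypothetical, and the sketch assumes that the bracket around UNIFORM(−1, 1) denotes rounding to the nearest integer, which yields a deviation of −1, 0 or +1.

```python
import random

def pick_stepsize_disc(rng, minss, maxss):
    """Random integer step size in [minss, maxss], by rounding a uniform draw."""
    draw = round(rng.uniform(minss - 0.5, maxss + 0.5))
    return max(minss, min(maxss, draw))  # guard against the interval endpoints

def next_value(rng, prev, direction, ss, lo, hi):
    """Advance one decision variable along its component of the unit direction.

    The move is rounded to an integer, perturbed by a small random deviation
    (the "one-unit neighborhood" of the best direction), then clamped to limits.
    """
    jitter = round(rng.uniform(-1.0, 1.0))  # assumed reading of [UNIFORM(-1, 1)]
    return min(hi, max(lo, round(prev + direction * ss) + jitter))

rng = random.Random(1)
steps = [pick_stepsize_disc(rng, 2, 5) for _ in range(1000)]
moves = [next_value(rng, 20, 0.8, 3.0, 10, 40) for _ in range(1000)]
```

Drawing from an interval widened by 0.5 on each side before rounding gives every integer between the limits the same probability.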

Algorithm 2 Hill-climbing heuristic for the discrete delay version

Require: O_g  ▷ order of the delay present in the game
Require: T_g  ▷ duration of the delay present in the game
Require: T_h  ▷ duration of the delay assumed by the hypothetical player in the heuristic
Require: w_del  ▷ weight of the delayed information in making decisions
Require: EFFECTPRICE(p_n)
Require: EFFECTADVTS(a_n)
Require: UNIFORM(a, b)
Require: FIND_COMBINED_DIRECTION(p_1, a_1, g_1, p_2, a_2, g_2)
Require: PICK_STEPSIZE_CONT(minss, maxss)
Require: PICK_STEPSIZE_DISC(minss, maxss)

function PROFITD(p, a, eop_d, eoa_d)  ▷ calculates delayed profit given current price, advertising, delayed effect of price and delayed effect of advertising
    return 100 · (eop_d · eoa_d) · p − 50 · a − 5 · 100 · (eop_d · eoa_d)
end function

Initialize
    b_1 ← (0, 0)^T, b_2 ← (0, 0)^T  ▷ two base directions
    ug_1 ← 0, ug_2 ← 0  ▷ base unit gains for each direction
    d* ← (0, 0)^T  ▷ best direction
    t ← 0
    a_t ← 10, p_t ← 20  ▷ initial conditions of the game
    Π_t ← PROFIT(p_t, a_t)
    eop_d ← 1, eoa_d ← 1  ▷ variables keeping delayed effect of price and advertising

Step 1
    t ← t + 1
    if UNIFORM(0, 1) < 0.5 then  ▷ randomly pick price or advertising as the starting variable to change
        picked ← p
    else
        picked ← a
    end if
    p_d ← PICK_DIRECTION()  ▷ pick a random direction for price
    p_ss ← PICK_STEPSIZE_DISC(2, 5)  ▷ pick a stepsize between 2 and 5
    a_d ← PICK_DIRECTION()  ▷ pick a random direction for advertising
    a_ss ← PICK_STEPSIZE_DISC(2, 5)  ▷ pick a stepsize between 2 and 5
    if picked = p then
        p_t ← max(p_min, min(p_max, p_{t−1} + p_d · p_ss))
        a_t ← a_{t−1}
        p_{t+1} ← p_t
        a_{t+1} ← max(a_min, min(a_max, a_t + a_d · a_ss))
    else
        a_t ← max(a_min, min(a_max, a_{t−1} + a_d · a_ss))
        p_t ← p_{t−1}
        a_{t+1} ← a_t
        p_{t+1} ← max(p_min, min(p_max, p_t + p_d · p_ss))
    end if
    for τ = 2 to 3 do  ▷ 4-step routine for updating delay stocks and calculating current profit
        for ω = 11 to 2 do  ▷ update delay order stocks by using lower order stock’s level as input
            eoadelst_ω ← eoadelst_ω + (eoadelst_{ω−1} − eoadelst_ω) / (T_g/O_g)
            eopdelst_ω ← eopdelst_ω + (eopdelst_{ω−1} − eopdelst_ω) / (T_g/O_g)
        end for
        eoadelst_1 ← eoadelst_1 + (EFFECTADVTS(a[τ]/10) − eoadelst_1) / (T_g/O_g)
        eopdelst_1 ← eopdelst_1 + (EFFECTPRICE(p[τ]/20) − eopdelst_1) / (T_g/O_g)  ▷ update first order stock by using the current function value
        Π_τ ← PROFITD(p_τ, a_τ, eop_d, eoa_d)  ▷ calculate the weekly profit by using the last updated value of the delayed effect function for the corresponding delay order
        eoa_d ← eoadelst_{O_g}, eop_d ← eopdelst_{O_g}  ▷ update the delayed effect functions
    end for  ▷ end of 4-step routine

    ▷ initialize two base directions based on immediate information
    b_1^{imm} ← ( (p_t − p_{t−1}) · sgn(Π_t − Π_{t−1} + ε), (a_t − a_{t−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_1^{imm} ← b_1^{imm} / ‖b_1^{imm}‖
    b_2^{imm} ← ( (p_{t+1} − p_t) · sgn(Π_{t+1} − Π_t + ε), (a_{t+1} − a_t) · sgn(Π_{t+1} − Π_t + ε) )^T,  b_2^{imm} ← b_2^{imm} / ‖b_2^{imm}‖
    ug_1^{imm} ← (Π_t − Π_{t−1}) / ‖b_1^{imm}‖,  ug_2^{imm} ← (Π_{t+1} − Π_t) / ‖b_2^{imm}‖
    d^{imm} ← FIND_COMBINED_DIRECTION(b_{1,1}^{imm}, b_{1,2}^{imm}, ug_1^{imm}, b_{2,1}^{imm}, b_{2,2}^{imm}, ug_2^{imm})  ▷ Find the best direction based on immediate information


Step 2
    while t < 3 do
        if t > T_h + 1 then  ▷ if enough time has passed, initialize directions based on delayed information
            if (p_{t−T_h} ≠ p_{t−T_h−1}) ∨ (a_{t−T_h} ≠ a_{t−T_h−1}) then  ▷ if either price or advertising has changed T_h weeks ago, use new information
                if ‖b_1^{del}‖ = 0 then  ▷ if first base direction is still zero, use T_h-week-old data to initialize it
                    b_1^{del} ← ( (p_{t−T_h} − p_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε), (a_{t−T_h} − a_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_1^{del} ← b_1^{del} / ‖b_1^{del}‖
                    ug_1^{del} ← (Π_t − Π_{t−1}) / ‖b_1^{del}‖
                else if ‖b_2^{del}‖ = 0 then  ▷ if first base direction is nonzero but second base direction is zero, use T_h-week-old data to initialize second base direction
                    b_2^{del} ← ( (p_{t−T_h} − p_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε), (a_{t−T_h} − a_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_2^{del} ← b_2^{del} / ‖b_2^{del}‖
                    ug_2^{del} ← (Π_t − Π_{t−1}) / ‖b_2^{del}‖
                else  ▷ if both base directions are nonzero, use T_h-week-old data to update the directions
                    b_1^{del} ← b_2^{del}
                    ug_1^{del} ← ug_2^{del}
                    b_2^{del} ← ( (p_{t−T_h} − p_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε), (a_{t−T_h} − a_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_2^{del} ← b_2^{del} / ‖b_2^{del}‖
                    ug_2^{del} ← (Π_t − Π_{t−1}) / ‖b_2^{del}‖
                end if
                d^{del} ← FIND_COMBINED_DIRECTION(b_{1,1}^{del}, b_{1,2}^{del}, ug_1^{del}, b_{2,1}^{del}, b_{2,2}^{del}, ug_2^{del})  ▷ Find the best direction based on delayed information
            end if
        end if
        t ← t + 1
    end while

Step 3
    while t < 40 do
        d ← w_del · d^{del} + (1 − w_del) · d^{imm}  ▷ Calculate the direction of movement
        d* ← d / ‖d‖  ▷ divide by its length to have a unit vector
        ss ← PICK_STEPSIZE_CONT(1, 4)  ▷ Pick a step size
        p_t ← min(p_max, max(p_min, ⌊p_{t−1} + d*_1 · ss⌉ + [UNIFORM(−1, 1)]))
        a_t ← min(a_max, max(a_min, ⌊a_{t−1} + d*_2 · ss⌉ + [UNIFORM(−1, 1)]))
        ▷ 4-step routine for updating delay stocks and calculating current profit
        for ω = 11 to 2 do
            eoadelst_ω ← eoadelst_ω + (eoadelst_{ω−1} − eoadelst_ω) / (T_g/O_g)
            eopdelst_ω ← eopdelst_ω + (eopdelst_{ω−1} − eopdelst_ω) / (T_g/O_g)
        end for
        eoadelst_1 ← eoadelst_1 + (EFFECTADVTS(a[τ]/10) − eoadelst_1) / (T_g/O_g)
        eopdelst_1 ← eopdelst_1 + (EFFECTPRICE(p[τ]/20) − eopdelst_1) / (T_g/O_g)
        Π_t ← PROFITD(p_t, a_t, eop_d, eoa_d)
        eoa_d ← eoadelst_{O_g}, eop_d ← eopdelst_{O_g}  ▷ end of 4-step routine
        if (p_t ≠ p_{t−1}) ∨ (a_t ≠ a_{t−1}) then  ▷ Update base immediate directions if moved
            b_1^{imm} ← b_2^{imm}
            ug_1^{imm} ← ug_2^{imm}
            b_2^{imm} ← ( (p_t − p_{t−1}) · sgn(Π_t − Π_{t−1} + ε), (a_t − a_{t−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_2^{imm} ← b_2^{imm} / ‖b_2^{imm}‖
            ug_2^{imm} ← (Π_t − Π_{t−1}) / ‖b_2^{imm}‖
        end if
        d^{imm} ← FIND_COMBINED_DIRECTION(b_{1,1}^{imm}, b_{1,2}^{imm}, ug_1^{imm}, b_{2,1}^{imm}, b_{2,2}^{imm}, ug_2^{imm})  ▷ Find the best direction based on immediate information
        if t > T_h + 1 then  ▷ if enough time has passed, initialize directions based on delayed information
            if (p_{t−T_h} ≠ p_{t−T_h−1}) ∨ (a_{t−T_h} ≠ a_{t−T_h−1}) then  ▷ if either price or advertising has changed T_h weeks ago, use new information
                if ‖b_1^{del}‖ = 0 then  ▷ if first base direction is still zero, use T_h-week-old data to initialize it
                    b_1^{del} ← ( (p_{t−T_h} − p_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε), (a_{t−T_h} − a_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_1^{del} ← b_1^{del} / ‖b_1^{del}‖
                    ug_1^{del} ← (Π_t − Π_{t−1}) / ‖b_1^{del}‖
                else if ‖b_2^{del}‖ = 0 then  ▷ if first base direction is nonzero but second base direction is zero, use T_h-week-old data to initialize second base direction
                    b_2^{del} ← ( (p_{t−T_h} − p_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε), (a_{t−T_h} − a_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_2^{del} ← b_2^{del} / ‖b_2^{del}‖
                    ug_2^{del} ← (Π_t − Π_{t−1}) / ‖b_2^{del}‖
                else  ▷ if both base directions are nonzero, use T_h-week-old data to update the directions
                    b_1^{del} ← b_2^{del}
                    ug_1^{del} ← ug_2^{del}
                    b_2^{del} ← ( (p_{t−T_h} − p_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε), (a_{t−T_h} − a_{t−T_h−1}) · sgn(Π_t − Π_{t−1} + ε) )^T,  b_2^{del} ← b_2^{del} / ‖b_2^{del}‖
                    ug_2^{del} ← (Π_t − Π_{t−1}) / ‖b_2^{del}‖
                end if
                d^{del} ← FIND_COMBINED_DIRECTION(b_{1,1}^{del}, b_{1,2}^{del}, ug_1^{del}, b_{2,1}^{del}, b_{2,2}^{del}, ug_2^{del})  ▷ Find the best direction based on delayed information
            end if
        end if
        t ← t + 1
    end while
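
The delay-stock routine that appears in Steps 1 and 3 of Algorithm 2 can be illustrated with a small Python sketch of a chain of first-order stocks, each adjusting toward the stock below it with time constant T_g/O_g. The function name and the list representation are assumptions; the sketch uses a one-week time step and assumes T_g/O_g ≥ 1 so that the update does not overshoot.

```python
def step_delay_chain(stocks, inflow, delay_time):
    """Advance a chain of first-order delay stocks by one week.

    stocks: list of stock levels, lowest order first (modified in place)
    inflow: current value of the undelayed effect function
    delay_time: total delay duration (T_g); the order is len(stocks) (O_g)
    Returns the level of the last stock, i.e. the delayed effect.
    """
    tau = delay_time / len(stocks)  # time constant per stage, T_g / O_g
    # Update from the highest order down, so each stock uses the
    # not-yet-updated level of the stock below it as input.
    for w in range(len(stocks) - 1, 0, -1):
        stocks[w] += (stocks[w - 1] - stocks[w]) / tau
    stocks[0] += (inflow - stocks[0]) / tau
    return stocks[-1]

# Step response of a third-order delay with a total delay time of 6 weeks
chain = [0.0, 0.0, 0.0]
response = [step_delay_chain(chain, 1.0, 6.0) for _ in range(40)]
```

The output of the last stock stays at zero for the first weeks and then rises gradually toward the new input level, which is why the effect of a decision on profit becomes visible only after several weeks.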
