Experiments in nudging players

Player modeling can be used to train NPCs and bots, to dynamically customize gameplay (for example, an enemy’s strategies could change based on play style), and to aid testing and level design. In our 2013 AIIDE paper, we proposed a simple probabilistic method for modeling players that could be used to bias players towards certain behaviors. The underlying assumption is that players tend to act in certain ways based on what is available in their environment. Thus, if we know the relationship between player behavior and environment, we can tweak the environment to encourage people to behave in certain ways.

In other words, we can model what players do where and then use this information to nudge player behaviors in desired ways. There may be applications for this beyond testing and data collection, since video games are unique in that we have absolute control over the environment we present to players. For example, could we better understand which environmental or game incentives encourage (or discourage) PVP? What differences are there between free-to-play players and subscribers? What do players tend to do at max level?
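To make "model what players do where" concrete, here is a minimal sketch of a count-based estimate of P(behavior | environment). It is only an illustration, not the model from the paper: the class name, the add-one smoothing, and the environment and behavior labels (loosely borrowed from the health-kit metric) are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class PlayerModel:
    """Hypothetical count-based estimate of P(behavior | environment)."""

    def __init__(self, behaviors):
        self.behaviors = list(behaviors)
        # environment -> how often each behavior was observed there
        self.counts = defaultdict(Counter)

    def observe(self, environment, behavior):
        self.counts[environment][behavior] += 1

    def prob(self, behavior, environment):
        # Add-one (Laplace) smoothing so unseen behaviors keep nonzero mass.
        seen = self.counts[environment]
        total = sum(seen.values()) + len(self.behaviors)
        return (seen[behavior] + 1) / total

    def best_environment(self, behavior, environments):
        # The "nudge": pick the environment where the behavior is most likely.
        return max(environments, key=lambda env: self.prob(behavior, env))


# Made-up observations, purely for illustration.
model = PlayerModel(behaviors=["use_kit", "skip_kit"])
for env, act in [("low_health", "use_kit"), ("low_health", "use_kit"),
                 ("low_health", "skip_kit"), ("high_health", "skip_kit"),
                 ("high_health", "skip_kit")]:
    model.observe(env, act)

print(model.prob("use_kit", "low_health"))
print(model.best_environment("use_kit", ["low_health", "high_health"]))
```

With a table like this, nudging amounts to asking which environment makes the desired behavior most probable and then presenting that environment to the player.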

For the paper, we specifically looked at a straightforward application of this idea for collecting player metrics. Such an approach could reduce the number of games playtesters need to run because it would allow them to focus on collecting data only for the metrics that need it most. For our proof of concept, we implemented several dynamically configurable environments in Second Life and collected several very simple behavior metrics: the distances between people standing in either narrow or wide spaces; the timing of lane crossings for slow and fast traffic; and the choice of whether to use a health kit based on health level.

Below are two screenshots from two of our experiment setups (top: a space environment; bottom: an office environment), in which players race around to collect tiles for prizes.



Using our player model, we formulated the question of which game to run next as an optimization problem (an MDP), ran the selected games, and then updated the player model with the results. Even without running the optimization, one can look at the statistics to see which behaviors occur most frequently in which environments. Even in our straightforward setup, our assumptions about what players would do were often wrong! We also showed that our optimization-based scheduler did reduce the number of games we needed to run, compared to a schedule that played all scenarios equally often. However, there are caveats and limitations to the approach which are worth reading about in the paper.
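As a rough illustration of the "which game to run next" loop, the sketch below uses a much simpler greedy rule than the MDP formulation in the paper: it treats each scenario's metric as a yes/no outcome, keeps a Beta posterior over it, and schedules the scenario whose estimate is currently most uncertain. The scenario names and tallies are invented for the example.

```python
def beta_variance(successes, failures):
    """Variance of a Beta(successes + 1, failures + 1) posterior."""
    a, b = successes + 1, failures + 1
    return (a * b) / ((a + b) ** 2 * (a + b + 1))


def next_scenario(tallies):
    """Greedy stand-in for the scheduler: run the most uncertain scenario."""
    return max(tallies, key=lambda name: beta_variance(*tallies[name]))


def record(tallies, name, outcome):
    """Update the (successes, failures) tally after one game."""
    successes, failures = tallies[name]
    if outcome:
        tallies[name] = (successes + 1, failures)
    else:
        tallies[name] = (successes, failures + 1)


# Hypothetical tallies: (times the behavior occurred, times it did not).
tallies = {
    "narrow_space": (18, 2),
    "wide_space": (3, 4),
    "fast_traffic": (9, 9),
}

choice = next_scenario(tallies)
print(choice)
record(tallies, choice, True)  # pretend the behavior occurred in that game
```

A uniform schedule would keep spending games on "narrow_space" even though its estimate is already fairly settled; the greedy rule spends them where the data is thinnest. It does not plan ahead the way an MDP policy would.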

Even for testing, we didn’t get a chance to explore these ideas further, but I always envisioned it having potential in large, open-world multiplayer environments, where it’s particularly difficult to work out every glitch or even understand a priori how players will interact with each other and the game (although companies do an amazing job). For example, could this approach help debug aspects of the environment that lead to players becoming trapped in walls (e.g. do environments where players become trapped share certain characteristics, or do aspects of the character, such as speed at impact, cause it to become stuck)? Could this approach help debug navigation problems for companion characters, who may block and trap the player in certain areas? After all, once a problem is understood well enough to be reproducible, it’s often straightforward to fix.