Much has been written about perfect information games like chess, and I am reviewing literature on imperfect information games like hearts. As many of you know, I am a huge fan of Battlestar Galactica (BSG) and the associated board game. I believe there is a level beyond imperfect information games that requires deception. Honestly, it seems like a bad idea to teach machines how to lie, but as a hacker, I am focused on building a worthy adversary.
Deception in games requires a long game and a powerful reveal. For example: “Oh no! Jeff was the Cylon, we are screwed!!!” The key problem with deception games is that the deceiving player must play competently without anyone suspecting that they will eventually betray the group. There is a point when it becomes important to reveal the deception in order to accelerate the group's pending demise, but if done too soon, the deception is less effective.
I find the literature on this topic insufficient, as most of it builds neural networks that aim to move from one state to a better state, either minimizing or maximizing various factors. Deception requires embracing mediocrity, or “meh” moments.
I am not an AI expert, as I left my education to build a company and entered the tech industry as an infrastructure professional. I am enjoying the beginner's mindset, and I am currently in the conceptualization phase for a game that I plan to ship next year. As part of this process, I would like to improve solo play in the game.
Rather than focusing solely on this one game, I am looking at solving the problem of artificial players in all games in a holistic manner. This is not the best approach, but I intend to build my game in the hardest way possible.
The Adama Platform simplifies the AI game loop by asking each player a question along with a list of options. Essentially, every game can be seen as a multiple-choice test with a potentially large number of choices. This means that I already have the worst AI available: a random bot. From a testing perspective, this is useful as I was able to use a random bot to quickly find bugs in my implementation of BSG.
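To make this concrete, here is a minimal Python sketch of that loop (the game and all names here are toys of my own, not the platform's API):

import random

def random_bot(state, decision, options):
    # the worst AI available: ignore the state and the question,
    # and pick uniformly among the legal options
    return random.choice(options)

# toy stand-in for the platform's question loop: keep asking
# "which card do you play?" until the hand is empty
hand = list(range(10))
while hand:
    card = random_bot(state=None, decision="play_card", options=hand)
    hand.remove(card)
    print("played", card)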
The open question is how to leverage this “super power” to improve the game. One potential solution is to use machine learning to build a function, f(S, D, O), that takes the current state of the game, S, a labeled decision, D, and a set of options, O, and selects the best option to play.
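As a rough sketch of what that could look like (with a hand-rolled placeholder where the learned model would eventually go):

def f(score_model, state, decision, options):
    # score every option with the model and play the argmax
    return max(options, key=lambda o: score_model(state, decision, o))

def toy_score(state, decision, option):
    # placeholder standing in for whatever machine learning produces;
    # fittingly, this one prefers "meh" middling cards
    return -abs(option - 5)

print(f(toy_score, None, "play_card", list(range(10))))  # prints 5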
In the beginning, I may use classical AI techniques like building a game tree to improve the performance of the AI in the game. However, for imperfect information games, this may be considered cheating, as the AI would in effect be able to learn the future. For example, an AI training repeatedly against the same deck of cards may overfit to the specific shuffle, giving the illusion of foresight. While this capability may be useful, it is important to prevent overfitting in order to create a fair and challenging game.
To prevent this, I am considering introducing a new Adama primitive called “@fuzz” that allows developers to shuffle decks, reset the random number generator, and perform other actions to mix up the hidden information. This would prevent overfitting while allowing the AI to learn and adapt to different situations. This new “@fuzz” event would look something like this:
@fuzz {
  // renumber every hidden card with a fresh shuffle so the AI
  // cannot memorize the deck order between runs
  int new_order = 0;
  (iterate _deck
    where place == Place::Hidden
    shuffle).ordering = new_order++;
}
This approach has the downside of violating some previous state transitions (such as a player placing a card on the bottom of the deck). However, the goal is to sample a set of possible worlds, converting imperfect information into a distribution over perfect-information states.
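From the AI's side, this amounts to determinization: sample many fuzzed worlds consistent with what is visible, evaluate each as a perfect information game, and average. A rough Python sketch, where an in-place shuffle stands in for the @fuzz event and the evaluator is a stub:

import random

def evaluate(world, option):
    # stand-in for a perfect-information evaluation of playing
    # `option` in a fully revealed `world`
    return random.random()

def determinized_value(visible, hidden, option, samples=100):
    # sample arrangements of the hidden cards (what @fuzz does
    # server-side), evaluate each as perfect information, and average
    total = 0.0
    for _ in range(samples):
        world = list(hidden)
        random.shuffle(world)
        total += evaluate((visible, world), option)
    return total / samples

print(determinized_value("table", list(range(20)), option=3))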
The first task is to build a prediction function, p, such that p(S, D, O) yields an estimated future state S’, along with the estimated next decision D’ and options O’. The platform should allow a random bot to play the game without guidance in order to build a prediction function that can help the AI understand the consequences of its actions. This will allow the AI to “imagine” how the game may unfold and operate on that imagined future using simple techniques.
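Reading p as a consequence model, I assume it conditions on the option actually chosen. Here is a toy Python sketch of the interface, where the "training data" is a log of the random bot's unguided play and a lookup table stands in for a learned model:

# tuples logged while the random bot played unguided: the question
# posed, the option played, and what the platform asked next
logs = [
    (("turn1", "play_card", 3), ("turn2", "draw_card", [0, 1])),
    (("turn1", "play_card", 7), ("turn2", "discard", [2, 4])),
]

def p(state, decision, option):
    # toy "model": exact-match lookup over the logs; a real p would
    # generalize to states it has never seen
    for (s, d, o), nxt in logs:
        if (s, d, o) == (state, decision, option):
            return nxt  # (S', D', O')
    return None

print(p("turn1", "play_card", 3))  # ('turn2', 'draw_card', [0, 1])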
The next challenge is picking a strategy of play. Here, the developer can guide the AI by mixing goals to minimize, maximize, or preserve various quantities. I am not sure how to teach the AI to learn strategies on its own, and I suspect I will need to use the Adama language to provide hints and guidance. For example:
// some resource of concern
public int resource;
// tell the AI system that this is an objective
@objective resource;
This approach may feel like cheating, but it is similar to describing the end conditions within the ruleset of the game. It may be worth considering that everything in the state can be seen as an objective, so a language extension may not be necessary. This creates a new decision problem for the AI, which must determine how to tune the variables (minimize, maximize, preserve, or ignore) and in what priority order. This may require learning a new function, w(S, S’), which weighs the importance of moving from the current state to a future state.
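A crude Python sketch of what w could look like, assuming the tuning lives in a table of objectives (all names here are hypothetical):

# which state variables to maximize, minimize, or preserve,
# and with what priority
objectives = {
    "resource": ("maximize", 1.0),
    "morale":   ("preserve", 0.5),
    "distance": ("minimize", 2.0),
}

def w(s, s_next):
    # weigh a transition: reward movement in each variable's
    # preferred direction, penalize any drift for "preserve"
    total = 0.0
    for key, (mode, weight) in objectives.items():
        delta = s_next[key] - s[key]
        if mode == "maximize":
            total += weight * delta
        elif mode == "minimize":
            total -= weight * delta
        else:  # preserve
            total -= weight * abs(delta)
    return total

print(w({"resource": 5, "morale": 3, "distance": 10},
        {"resource": 7, "morale": 3, "distance": 8}))  # 6.0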
Conceptually, if I can learn enough about machine learning to build the p and w functions, then I can create a simple game imagination that allows the AI to make decisions by imagining many steps into the future; a toy sketch follows below. I am not sure if this is the right approach, but I am interested in hearing your thoughts on this.
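For what it is worth, here is the shape of that imagination as a toy Python sketch, with deterministic stand-ins for p and w:

def p(s, d, o):
    # toy deterministic model: playing option o adds o to the score
    # and always offers the follow-up options [1, 2]
    return (s + o, "next", [1, 2])

def w(s, s_next):
    # value a transition by the score gained
    return s_next - s

def imagine(state, decision, options, depth=3):
    # score each option by rolling p forward `depth` steps and
    # accumulating w along the best imagined continuation
    def rollout(s, d, o, remaining):
        s2, d2, o2 = p(s, d, o)
        value = w(s, s2)
        if remaining > 1:
            value += max(rollout(s2, d2, nxt, remaining - 1) for nxt in o2)
        return value
    return max(options, key=lambda o: rollout(state, decision, o, depth))

print(imagine(0, "play", [1, 5, 9]))  # picks 9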
Follow me on Twitter to keep updated on the platform, or join the Discord and become part of the community I’m building.