Agents
This section defines agents and the concepts related to them.
Agent: an entity that has goals and preferences and tries to perform a series of actions that yield the best/optimal expected outcome given these goals.
Environment: Rational agents exist in an environment, which is specific to the given instantiation of the agent.
World: an environment and the agents that reside within it create a world.
For example, the environment for a checkers agent is the virtual checkers board on which it plays against opponents, where piece moves are actions.
Reflex agent: An agent that does not consider the consequences of its actions, but rather selects an action based solely on the current state of the world.
Planning agent: The planning agent maintains a model of the world and uses this model to simulate performing various actions.
Reflex agents are typically outperformed by planning agents.
Using this model, the planning agent can determine the hypothesized consequences of its actions and select the best one. This is simulated “intelligence” in the sense that it’s exactly what humans do when trying to determine the best possible move in any situation.
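As a rough sketch of the contrast (the `score_action`, `score_state`, and `simulate` helpers are hypothetical, supplied by the caller, not part of any standard library):

```python
def reflex_agent(state, actions, score_action):
    # A reflex agent scores actions using only the current state; no lookahead.
    return max(actions, key=lambda a: score_action(state, a))

def planning_agent(state, actions, simulate, score_state):
    # A planning agent simulates each action with its model of the world and
    # keeps the action whose hypothesized successor state looks best.
    return max(actions, key=lambda a: score_state(simulate(state, a)))
```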
PEAS: Performance measure, Environment, Actuators, Sensors. PEAS is used to define the task environment. The performance measure describes what utility the agent tries to increase. The environment summarizes where the agent acts and what affects the agent. The actuators and the sensors are the methods by which the agent acts on the environment and receives information from it.
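As an illustration (the field values below are assumptions about the checkers agent mentioned earlier, not a standard specification), a PEAS description can be written down explicitly:

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    performance_measure: str
    environment: str
    actuators: str
    sensors: str

# Illustrative PEAS description for a checkers agent.
checkers_agent = PEAS(
    performance_measure="games won / pieces captured",
    environment="the virtual checkers board and the opposing player",
    actuators="moves of the agent's own pieces",
    sensors="the observed board configuration after each turn",
)
```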
Types of environments:
- Partially observable vs. fully observable environment
In a partially observable environment, the agent does not have full information about the state, so it must maintain an internal estimate of the state of the world. E.g., an agent playing Dota 2 or League of Legends does not know the state of its opponents at every moment. In contrast, in a fully observable environment the agent has full information about the state, e.g., an agent playing chess.
- Stochastic vs. deterministic environment
A stochastic environment has uncertainty in the transition model: taking an action in a specific state may have multiple possible outcomes with different probabilities. In a deterministic environment, an action taken in a state has a single outcome that is guaranteed to happen.
- Multi-agent vs. single-agent environment
An agent in a multi-agent environment must act alongside, and possibly coordinate with, other agents.
- Static vs. dynamic environment
If the environment does not change as the agent acts on it, the environment is static; a dynamic environment changes as the agent interacts with it.
State Spaces and Search Problems
We now give a mathematical formulation for describing the environment in which an agent exists.
Search problem: Given our agent’s current state, how can we arrive at a new state that satisfies its goals in the best possible way? A search problem contains the following elements (a minimal interface sketch follows the list):
- A state space: the set of all states that are possible in your given world.
- A set of actions available in each state.
- A transition model: Outputs the next state when a specific action is taken at the current state
- An action cost: Describes the cost incurred when moving from one state to another after applying an action
- A start state in which the agent exists initially
- A goal test: A function that takes a state as input and determines whether it is a goal state
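These elements can be bundled into a small interface, sketched below under the assumption that successors, costs, and the goal test are exposed as methods (the method names are illustrative, not a required API):

```python
class SearchProblem:
    """Illustrative interface bundling the elements of a search problem."""

    def get_start_state(self):
        """The state the agent starts in."""
        raise NotImplementedError

    def is_goal_state(self, state):
        """Goal test: does this state satisfy the agent's goal?"""
        raise NotImplementedError

    def get_successors(self, state):
        """Transition model plus action cost: yields (action, next_state, cost)
        for every action available in `state`."""
        raise NotImplementedError
```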
A search problem is solved by first considering the start state, then exploring the state space using the actions, transition model, and costs, iteratively computing children of various states until we arrive at a goal state, at which point we have determined a path from the start state to the goal state (typically called a plan).
The order in which states are considered is determined using a predetermined strategy. We will cover types of strategies and their usefulness shortly.
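A sketch of that process, assuming the illustrative `SearchProblem` interface above and a `frontier` object with `push`, `pop`, and `is_empty` methods; the choice of frontier (stack, queue, priority queue) is exactly the predetermined strategy:

```python
def generic_search(problem, frontier):
    """Expand states from the frontier until a goal state is found,
    then return the plan (the sequence of actions that reaches it)."""
    frontier.push((problem.get_start_state(), []))   # (state, plan so far)
    visited = set()
    while not frontier.is_empty():
        state, plan = frontier.pop()
        if problem.is_goal_state(state):
            return plan                               # plan reaching a goal state
        if state in visited:
            continue
        visited.add(state)
        for action, next_state, cost in problem.get_successors(state):
            frontier.push((next_state, plan + [action]))
    return None                                       # no plan exists
```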
World State vs. Search State
The world state contains all information about a given state, whereas a search state contains only the information about the world that is necessary for planning (primarily for space efficiency reasons).
Example: Pacman
The game of Pacman is simple: Pacman must navigate a maze and eat all the (small) food pellets in the maze without being eaten by the malicious patrolling ghosts. If Pacman eats one of the (large) power pellets, he becomes ghost-immune for a set period of time and gains the ability to eat ghosts for points.
Consider a case where the maze contains only Pacman and food pellets. There are two distinct search problems in this scenario: pathing and eat-all-dots.
Pathing tries to solve the problem of getting from position \((x_1, y_1)\) to position \((x_2, y_2)\) in the maze optimally (a sketch of its successor function follows the list below).
- Pathing
- States: (x,y) locations
- Actions: (Move to) North, South, East, West
- Transition model (getting the next state): Update location only
- Goal test: Is (x,y) = END?
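A minimal sketch of the pathing successor function and goal test, assuming a `walls` set of blocked cells and unit step costs (both are illustrative assumptions):

```python
MOVES = {"North": (0, 1), "South": (0, -1), "East": (1, 0), "West": (-1, 0)}

def pathing_successors(state, walls):
    """Yield (action, next_state, cost) for legal moves from an (x, y) state."""
    x, y = state
    for action, (dx, dy) in MOVES.items():
        nxt = (x + dx, y + dy)
        if nxt not in walls:          # `walls` is an assumed set of blocked cells
            yield action, nxt, 1      # unit cost per move

def pathing_goal_test(state, end):
    return state == end
```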
Eat-all-dots tries to solve the problem of consuming all food pellets in the maze in the shortest time possible (a sketch follows the list below).
- Eat-all-dots
- States: (x,y) location, dot booleans
- Actions: (Move to) North, South, East, West
- Transition model (getting the next state): Update location and booleans
- Goal test: Are all dot booleans false?
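A similar sketch for eat-all-dots, where the search state carries dot booleans alongside the location (the representation details are assumptions for illustration):

```python
MOVES = {"North": (0, 1), "South": (0, -1), "East": (1, 0), "West": (-1, 0)}

def eat_all_dots_successors(state, walls, dot_positions):
    """State is ((x, y), dots), where `dots` is a tuple of booleans, one per
    pellet in `dot_positions`, True while that pellet is still uneaten."""
    (x, y), dots = state
    for action, (dx, dy) in MOVES.items():
        nxt = (x + dx, y + dy)
        if nxt in walls:
            continue
        # Mark as eaten (False) any pellet sitting on the new location.
        new_dots = tuple(d and (pos != nxt) for d, pos in zip(dots, dot_positions))
        yield action, (nxt, new_dots), 1

def eat_all_dots_goal_test(state):
    _, dots = state
    return not any(dots)              # goal: all dot booleans are false
```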
A world state may contain information like the total distance traveled by Pacman or all positions visited by Pacman. Pathing states contain less information than states for eat-all-dots.
State Space Size
The state space size determines the complexity of solving a search problem. To find the state space size, we use the fundamental counting principle, which states that if there are n variable objects in a given world which can take on \(x_1, x_2, \dots, x_n\) different values respectively, then the total number of states is \(x_1 \cdot x_2 \cdots x_n\).
Example: Pacman
Assume the variable objects and their corresponding numbers of possibilities are as follows:
- Pacman positions: Pacman can be in 120 distinct (x,y) positions, and there is only one Pacman.
- Pacman direction: Pacman can be facing North, South, East, or West, for a total of 4 possibilities.
- Ghost positions: There are two ghosts, each of which can be in 12 distinct (x,y) positions
- Food pellet configurations: There are 30 food pellets, each of which can be eaten or not eaten.
We have 120 Pacman positions, 4 Pacman direction possibilities, \(12 \times 12\) ghost configurations, and \(2 \cdot 2 \cdots 2 = 2^{30}\) food pellet configurations (each of the 30 food pellets has two possible values: eaten or not eaten). This gives a total state space size of \(120 \cdot 4 \cdot 12^2 \cdot 2^{30}\).
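As a quick arithmetic check of that product:

```python
# Fundamental counting principle applied to the Pacman example above.
state_space_size = 120 * 4 * 12**2 * 2**30
print(state_space_size)  # 74217034874880, roughly 7.4e13 states
```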