Grid Representation in Neural Networks for Automated Planning
Michaela Urbanovská and Antonín Komenda
Department of Computer Science (DCS), Faculty of Electrical Engineering (FEE),
Czech Technical University in Prague (CTU), Karlovo namesti 293/13, Prague, 120 00, Czech Republic
Keywords:
Classical Planning, Domain-independent Planning, Neural Networks, Problem Representation.
Abstract:
Automated planning and machine learning create a powerful combination of tools which allows us to apply
general problem solving techniques to problems that are not modeled using classical planning techniques. In
real-world scenarios and complex domains, creating a standardized representation is often a bottleneck as it
has to be modeled by a human. That often limits the usage of planning algorithms to real-world problems.
The standardized representation is also not suitable for neural network processing and often requires further
transformation. In this work, we focus on presenting three different grid representations that are well suited
to model a variety of classical planning problems which can be then processed by neural networks without
further modifications. We also analyze classical planning benchmarks in order to find domains that correspond
to our proposed representations. Furthermore, we also show that domains that are not explicitly defined on a
grid can be represented on a grid with minor modifications that are domain specific. We discuss advantages
and drawbacks of our proposed representations, provide examples for many planning benchmarks and also
discuss the importance of data and its structure when training neural networks for planning.
1 INTRODUCTION
Machine learning and the usage of neural networks in the field of classical planning have been an interesting research direction in recent years. The combination of powerful automated planning techniques and machine learning techniques such as neural networks promises great results towards general artificial intelligence. The planning algorithms provide strong problem solving tools and the neural networks provide generalization to unseen inputs based on learned knowledge. The task of combining the two comes with many challenges and one of them is determining the structure of the data.
Formulating a planning problem for off-the-shelf
planners has to be done by using an appropriate model-
ing language. The most prevalent one is PDDL (Aero-
nautiques et al., 1998) which uses a representation
based on first-order logic. It is understandable for hu-
mans who use it to describe the problem by hand. It
is, however, hard to interpret for statistical machine learning techniques that aim to learn and generalize from problem definitions.
There are many opportunities in classical planning
where machine learning can be applied. In (Asai and
Fukunaga, 2017) and (Asai and Fukunaga, 2018), the
authors focus on creating a latent representation of an image that represents the problem and on generating its PDDL representation.
There are also works that focus on improving
the classical planning search algorithms by using a
learned policy which tells the search the next action to use. One of them is (Toyer et al., 2020), where the authors use the PDDL representation directly to create a neural network architecture capable of processing any problem from a given problem domain. The authors in (Groshev et al., 2018) use a different problem representation but also aim to learn a policy that improves the performance of a search. They learn from a grid representation of the problem that is further processed
by computer vision techniques such as convolutional
neural networks (LeCun et al., 1989) and attention
(Vaswani et al., 2017).
Another direction is learning a heuristic function which can be used to guide any search algorithm towards a solution. A heuristic function takes a state on its input and returns a number which represents the distance from said state to a goal state. One approach that learns a heuristic is presented in (Shen et al., 2020) where the authors created a graph neural network which is able to process any given problem regardless of its
size or domain. A less general approach is presented in (Urbanovská and Komenda, 2021) where the authors use different types of neural network architectures to train a neural network that returns heuristic values. This
work uses a grid representation of the problems, similar to (Groshev et al., 2018).
In this work, we propose three novel representa-
tions of planning problems on a 2D and 3D grid which
can be utilized to train neural networks. We analyze
widely used planning benchmarks for domains that
are on a grid by default and use them as examples
for the grid representation. We also look at other do-
mains which are not on a grid by default but can be
transformed into a grid representation which supports
domain-independence of the proposed problem repre-
sentation. We also discuss the representations’ impact
on training neural networks and creating training data.
2 CLASSICAL PLANNING
BACKGROUND
Classical planning (Russell and Norvig, 2020) is a
form of general problem solving that starts in a fully
defined initial state and looks for a goal state. Each
state is formed by facts representing the truths about
the modelled world. Transitions between states are
performed by applying applicable actions. The notion
of applicability builds on the inner structure of the
states composed of the facts. States together with the
transitions create a graph which represents the state-
space of a problem. By using state-space search al-
gorithms we can find a path in such graph from the
initial state to a goal state. The path then represents
the sequence of applicable actions transforming the
initial state into one of the goal states.
State-space search algorithms can perform a blind search and simply explore the states one by one; however, in many cases the state-space size is exponential and a blind search cannot solve the problem at all or within a given time limit. That is why most algorithms use a heuristic function to guide the search and make more informed decisions when expanding the states.
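To make this concrete, the following is a minimal sketch of a heuristic-guided search (greedy best-first search) in Python; the is_goal, successors and h callbacks are assumed interfaces, not part of any specific planner discussed in this work.

```python
import heapq
import itertools

def greedy_best_first_search(initial_state, is_goal, successors, h):
    """Expand states in the order given by the heuristic value h(state)."""
    counter = itertools.count()  # tie-breaker so states themselves are never compared
    frontier = [(h(initial_state), next(counter), initial_state, [])]
    closed = set()
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan  # sequence of actions from the initial state to a goal state
        if state in closed:
            continue
        closed.add(state)
        for action, successor in successors(state):
            if successor not in closed:
                heapq.heappush(frontier,
                               (h(successor), next(counter), successor, plan + [action]))
    return None  # the reachable state-space contains no goal state
```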
2.1 Domain-independence
This work focuses mainly on the area of domain-independent automated planning. Domain-independent planning techniques can be applied to any correctly formulated problem (for example in PDDL) and they only use information about the problem and its domain that is provided in the standardized format.
Domain-independence is a very hard property to
achieve for neural networks as stated in (Geffner,
2018). Neural networks typically require a fixed input size and they tend to have many parameters that have to be tuned. That presents a challenge in scaling, because every problem instance can have a different size; it also poses a problem because changes in parametrization might not have obvious effects on small data but can become problematic later when we want the network to scale.
This challenge was tackled with HGNN in (Shen et al., 2020) where the authors use graph neural networks which can scale to arbitrarily large problems in arbitrary domains. The trade-off in this case is the size and computational time of the architecture.
Similarly in (Groshev et al., 2018) the authors can
tackle any size of the Sokoban puzzle thanks to using
convolutional neural networks (LeCun et al., 1989)
but they are mostly focused on the Sokoban domain.
2.2 Problem Representation
We mentioned the PDDL language which is a widely
used language based on first-order logic that provides
the domain-independent planners with information
about the problem we aim to solve. The formulation consists of two files: one contains the domain definition, which describes the world and its possible situations and actions, and the second one contains the specific definition of the problem instance we want to solve.
Usage of a universal language such as PDDL allows any planner to solve any problem, but the bottleneck is often the formulation of the problem since it has to be created by a human. For humans, real-world problems can be very hard and costly to formulate. Moreover, once the problem is formulated, there is no way to check its completeness.
The same view can be taken from the other side.
In (Francès et al., 2019) the authors mention that
creating a grid representation of a problem can be
very costly and using the already formulated PDDL
is a better option. However, that holds only for prob-
lems that are already correctly formulated or prob-
lems where their formulation is tractable. There are
problem domains which are, for example, on a grid
by default and in that case the PDDL formulation does
not bring as much simplification.
In real-world scenarios such as a forklift fleet, a distribution warehouse or an assembly line, it is very costly to have a human create a PDDL representation of the problem, even though these complex problems would benefit from using classical planning techniques. Instead of creating the PDDL definitions, we could take camera footage from the said warehouse and use image processing techniques to create a grid representation that approximates the real-world noisy input. Such an approach would be less costly than formulating the problems by hand.
For real-world problems, the grid representation can be seen as a latent space which allows us to use classical planning techniques on the approximated problem together with neural networks. There are different types of input which can be processed by a neural network, for example a vector representation. However, the size of the vector would scale with the size of the problem. Neural networks typically need a fixed-size input, so using fully connected neural networks with a vector representation does not scale with the problem size.
Another option is a graph representation, which is also used in (Shen et al., 2020). Graphs are an intuitive way of modelling the state-space of the problem. However, the state-space can be exponential and using graph neural networks to process exponentially large graphs is possible but very costly.
3 GRID REPRESENTATION
The set of problems from the International Plan-
ning Competition (IPC) contains classical planning
domains and problems which are typically used as
benchmarks to compare different planning algorithms
and heuristics (McDermott, 2000).
Since there are many architectures that use im-
age, grid or other 2D representation of the planning
problems, we looked at widely used IPC domains and
checked whether this property can be used as a benefit
for learning on planning problems.
The domains were provided via the API of planning.domains (http://api.planning.domains/) and we looked at the All-IPC STRIPS
set of problems which contains aggregation of the
planning domains over the years. From these domains
we aimed to collect a set of domains which are on a grid by default, and a set of domains which can be transformed into a grid representation in an intuitive way.
3.1 Grid Domains
Domains which are represented by a grid always contain the grid's definition in the problem definition file. We found six domains which could theoretically be processed by a suitable neural network architecture.
A problem definition that defines a grid structure contains all its coordinates and also defines the connections between them. Typically there are variables x_i where i ∈ {1, ..., N} and y_j where j ∈ {1, ..., M} which define the grid coordinates. Therefore, we can simply construct a grid of dimensions N × M which corresponds to the problem instance.
The next step is filling the grid with the objects defined in the problem instance. Typically there is a relation at(object, x, y) which defines the position of an object on the grid. By collecting these propositions from the problem we can create the initial state.
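A minimal sketch of this construction in Python; the at_facts list of (object, x, y) triples is an assumed result of parsing the initial state, with 1-based coordinates taken from the x_i and y_j variables.

```python
def build_grid(n, m, at_facts):
    """Construct an N x M grid and place objects according to at(object, x, y) facts."""
    grid = [[None] * m for _ in range(n)]
    for obj, x, y in at_facts:
        grid[x - 1][y - 1] = obj  # assuming 1-based coordinates in the PDDL facts
    return grid

# A toy 3 x 3 instance with an agent and a box:
initial_grid = build_grid(3, 3, [("agent", 1, 1), ("box", 2, 3)])
```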
We found six IPC domains that can be constructed
this way. We used problem descriptions from editor.planning.domains (http://editor.planning.domains/). In Figure 1 we can see the
grid representation for each of the above mentioned
domains. All the examples are instances from the IPC
domains data set.
floortile. A set of robots use different colors to
paint floor tiles. They can move in four directions
and each one of them can use any color. Each
robot can paint only tile that is in front (up) and
behind (down) them. Once a tile has been painted
the robot cannot stand on it.
pegsol. This domain simulates the Peg Solitaire single-player board game. At the start of the game, the board contains places occupied by pegs and one empty place. A valid move is jumping one peg over another into an empty space. The goal of the game is to end up with one peg at a defined position.
sokoban. In this puzzle game, the player has to
push boxes around a warehouse trying to get them
into storage locations. The player moves in four
directions and a box can be pushed only if there
is a clear space behind it. The goal is to store all
boxes.
tetris. This domain is a simplified version of the game Tetris. There are three types of pieces (1 × 1, 2 × 1, L) which are randomly distributed on a
grid. The goal is to free the upper half of the grid.
tidybot. This domain models a house cleaning task, in which one or more robots have to pick up objects and put them into goal locations. The
world is structured as a 2D grid with navigable
locations and surfaces on which objects may lie.
Every robot has a gripper which can hold only one
object but the robots may also use a cart which can
carry more objects at the same time.
visitall. An agent in the middle of a square grid
must visit all grid tiles. It can move in four direc-
tions.
Another related type of domains can be intuitively
represented by a grid with minor modifications to the
problem's representation. For example the logistics problem, where we have cities with different locations,
trucks and airplanes, and we aim to distribute packages using these vehicles to different locations. Such a scenario could potentially be projected onto a grid using the number of blocks between locations as an enumerated distance function. Another domain which is a good example is blocks world, where we have a table with a number of blocks and a robotic arm that has to stack and unstack the blocks to achieve a given configuration. Such modifications gave us six more domains that could be represented by a grid.
blocks. There are blocks placed on a table. A robotic hand can move one block at a time and has to stack them in a desired configuration.
depot. This domain combines the logistics and blocks domains. Trucks transport crates which have to be stacked onto pallets at their destination.
elevators. There is a building with N + 1 floors,
numbered from 0 to N. The building can be sep-
arated in blocks of size M + 1, where M divides
N. Adjacent blocks have a common floor. The
building has K fast (accelerating) elevators that stop only at floors that are multiples of M/2 (so M has to be an even number). Each fast elevator
has a capacity of X persons. Furthermore, within
each block, there are L slow elevators, that stop at
every floor of the block. Each slow elevator has a
capacity of Y.
logistics. There are several cities containing several locations, some of which are airports. There
are trucks which can drive within a city and air-
planes which can fly between airports. The goal
is to get packages from various locations to new
locations.
parking. This domain involves parking cars on
a street with N curb locations and where cars can
be double-parked but not triple-parked. The goal is to find a plan to move from one configuration of parked cars to another.
storage. Hoists move and store crates of goods from containers to depots with spatial maps.
The internal structure and actions of these do-
mains suggest that representing them on a grid is pos-
sible. In Figure 2 we show such a representation for each of these domains.
In the following sections, for simplicity, we will focus mainly on domains that are represented by a grid directly in the PDDL domain definition.
4 PROPOSED PROBLEM
REPRESENTATIONS
In this section we would like to present three different
problem representations which can be used to process
grid-based domains with neural networks. One of the approaches aims to stay in the 2D representation and the others lift the dimensionality of the problem representation.
4.1 2D Projection
The representation described in Section 3 can be used as a neural network input by itself. As we described, it can be constructed by directly parsing the PDDL and encoding all facts (objects) into the constructed grid. One advantage of this approach is the small size of the input because it is defined only in 2D. A problem with this representation is the number of objects present.
There are two encodings we can create for this representation (both sketched in the example below): a value per object type encoding, which assigns one value to each object type, and a value per object instance encoding, which assigns a distinct value to each object regardless of its type.
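Both encodings can be written in a few lines of Python, assuming a grid built as in Section 3 whose cells hold object identifiers (or None for empty tiles); type_of and type_values are assumed helpers mapping an object to its type and a type to its value.

```python
def encode_by_type(grid, type_of, type_values):
    """Value per object type: all objects of one type share one value; empty tiles are 0."""
    return [[type_values[type_of(cell)] if cell is not None else 0 for cell in row]
            for row in grid]

def encode_by_instance(grid):
    """Value per object instance: each object gets its own value (assigned in
    discovery order), regardless of its type; empty tiles are 0."""
    values = {}

    def value_of(obj):
        return values.setdefault(obj, len(values) + 1)

    return [[value_of(cell) if cell is not None else 0 for cell in row]
            for row in grid]
```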
We see a clear example in Figure 3, which shows examples for the Sokoban and Tetris domains. Sokoban uses the value per object type encoding that assigns one value to one object type. In this example, this type of encoding does not omit any information, as each object can be present only at one grid tile at a time. In the Sokoban domain in general, the value per object type encoding works as long as the initial state does not contain two objects at the same tile. This situation can happen when a box starts on a goal tile.
This type of encoding does not work for the Tetris domain, as we see in Figure 3. The value per object type encoding loses information about the location of individual blocks. The red and green blocks become indistinguishable, as do the yellow and orange blocks.
A way to overcome such a problem is using a value per object instance encoding which assigns a different value to each Tetris block regardless of its type. This encoding does allow us to tell the blocks apart. The
problem is in the semantics of such encoding. When
we use one value for one object type the encoding is
consistent and the network can learn a meaning be-
hind the values. In case of using a different value for
each block instance, the values lose their semantics in the problem domain. One value can represent different block types across the training data, which means that it does not have a connection to a specific block type.
Figure 1: Visualisation of all grid domains in All-IPC STRIPS (initial and goal states for floortile, pegsol, sokoban, tetris, tidybot and visitall).
Figure 2: Visualisation of all domains adapted to grids in All-IPC STRIPS (initial and goal states for blocks, depot, elevators, logistics, parking and storage).
Figure 3: Example of 2D representation encoding for a Sokoban and Tetris instance (value per object type and value per object instance).
It also goes the other way: we might represent a 1 × 1 block type by a certain value which then gains semantics that does not have to be correct, as we can use the same value to represent a different block type in a different instance. (Note that the values in Figure 3 are only illustrative; they could be replaced by any other encoding, such as values from the interval ⟨0, ..., 1⟩.)
This can also be a problem during training if we only use instances with up to K Tetris blocks. If we then deploy instances with K + 1 blocks or more, the trained network may not know how to treat the added block encoding.
4.2 3D Object Type Projection
To create a universal grid encoding mechanism we have to overcome the problems of encoding the objects. One way of doing so is creating a multi-dimensional representation of the grid. Transforming the grid to 3D allows us to create one grid for each object type, which results in a representation suitable for convolutional neural networks. We create an input of size N × M × C, where N and M represent the size of the original 2D grid and C represents the number of object types on the grid. Therefore, for one domain we are able to create inputs with a fixed C dimension and process them with convolutional kernels.
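A sketch of this construction with NumPy (an assumed dependency), reusing the grid and type_of helpers from the earlier sketches:

```python
import numpy as np

def object_type_projection(grid, object_types, type_of):
    """Stack one N x M layer per object type into an N x M x C input tensor."""
    n, m = len(grid), len(grid[0])
    channel = {t: c for c, t in enumerate(object_types)}  # fixed C dimension per domain
    tensor = np.zeros((n, m, len(object_types)), dtype=np.float32)
    for i in range(n):
        for j in range(m):
            if grid[i][j] is not None:
                tensor[i, j, channel[type_of(grid[i][j])]] = 1.0  # one-hot per type layer
    return tensor
```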
This representation does not eliminate the encoding problem completely, but it converts it into creating an encoding in terms of one object type. Such a transformation can be beneficial if there is an even distribution of the object types. However, in case of a problem with multiple instances of one object type, it does not simplify the problem.
In Figure 4 we see that representing the Sokoban domain in the 3D representation is possible, but as we already stated, the 2D representation is informative enough for this particular domain. Therefore we can just use one-hot encoding in every object type grid and create a 3D representation with objects in each layer encoded by 1.
The Tetris example shows that the problem that occurred in creating the 2D representation persists even when creating a separate grid for each object type. The one-hot encoding is not informative enough to carry all information about the blocks. To add the information back into the 3D representation, we can create a separate encoding for each object type grid. By using the value per object encoding per object type, we create an encoding for every object instance in each object grid. But since each grid only contains one object type, we do not lose the semantics of the values, because they are defined by the position of the object type grid in the input and not by the value itself.
Similarly to the previous case, when we train on at most K instances per object type and then encounter K + 1 instances, the network does not have any knowledge about the meaning of the (K + 1)-th encoding. The difference in this case is that the semantics of the (K + 1)-th object are clear from the grid it is presented on. Therefore we do not expect the network to learn the full context of the object, just to distinguish objects of one type.
4.3 3D Object Instance Projection
Besides creating a grid for each object type in the problem, we can also look at it the other way around and create a grid for every object instance in the problem. The construction of such a representation is analogous to the 3D object type projection, but instead of creating a grid for every object type we create a grid for every single object present in the problem definition. That way we create a complete one-hot encoding that separates the definition of every object.
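Under the same assumptions as before, a sketch of the instance projection; the objects list fixes the layer order for one problem instance.

```python
import numpy as np

def object_instance_projection(grid, objects):
    """Stack one N x M one-hot layer per object instance: N x M x |objects|."""
    n, m = len(grid), len(grid[0])
    layer = {obj: k for k, obj in enumerate(objects)}
    tensor = np.zeros((n, m, len(objects)), dtype=np.float32)
    for i in range(n):
        for j in range(m):
            if grid[i][j] is not None:
                tensor[i, j, layer[grid[i][j]]] = 1.0  # each object in its own layer
    return tensor
```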
In Figure 5 we can see how the encoding looks in practice if we encode each object into one separate grid. This encoding is purely one-hot and creates an input that is suitable for neural networks. One drawback of this representation is the number of grids we have to create, because it changes with the number of objects in the problem. If we want to reuse the architecture on different problems from the same domain or on different problems altogether, this problem has to be tackled.
Figure 4: Example of 3D object type projection encoding for a Sokoban and Tetris instance (one-hot encoding per object type and value per object encoding per object type).
The traditional convolutional kernel would not be usable in this case because the arbitrary number of objects results in a different size of the dimension C for different problems. One solution that seems reasonable is using a multi-dimensional convolutional kernel that is invariant to this issue.
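One way such a kernel could be realized, sketched here in PyTorch (an assumed framework choice, not the paper's implementation), is to treat the object dimension C as a third spatial axis of a 3D convolution, so the same weights slide over any number of object grids:

```python
import torch
import torch.nn as nn

# Input shape (batch, channels=1, C, N, M): the object grids become a third
# spatial axis instead of a fixed channel count, so C may vary per instance.
conv = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3, padding=1)

x = torch.zeros(1, 1, 10, 7, 7)  # 10 object grids of size 7 x 7
y = conv(x)                      # shape (1, 8, 10, 7, 7); the same weights work for any C
```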
We also look at the problem of semantics once again, this time in terms of the third dimension of the created input data. The network cannot tell which grid belongs to which object type, and therefore we should supply additional information to the network in order to train it. Another way would be creating a fixed encoding of object types which is the same in terms of one problem domain. We could then use the encoding's value for the corresponding layers instead of the 1 in the one-hot representation. An example of such an encoding is displayed in Figure 6. We still keep the "one grid per layer" rule with the addition of using the object type encoding as additional information.
4.4 Domain-independence of the
Proposed Representations
We showed that constructing a grid representation from PDDL can be done by only parsing the PDDL grid definition and positioning the defined objects in the grid. That is a category of problem domains which is perfectly suitable for our proposed representations.
As we showed in Figure 2, we are also able to create a grid representation for domains that do not contain an explicit grid definition, even though the transformation of such domains had to be done by hand using human intuition.
That does not imply that all existing planning domains can be represented on a grid. But it shows us that many of them can be, if we create the correct parser for their PDDL. Automating the PDDL to grid representation parsing by creating a parser per domain would be beneficial for training neural networks, as every grid can be transformed into one of our proposed representations.
5 NEURAL NETWORKS FOR
PLANNING
We proposed three novel representations which can
be used in neural network training for various types
of architectures. In this work, we do not refer to a
specific neural network architecture but rather address
the neural network training as a whole.
Neural networks typically train on large amounts
of unstructured noisy data which does not corre-
spond with learning in planning. Planning problems
are well-defined, deterministic and contain no noise.
Therefore, learning on planning problems is not a typ-
ical task defined in the neural network domain. The main property of neural networks which is important to planning is generalization to unseen problems.
5.1 Data Generation
In order to generalize, a neural network typically has
to see a sufficient number of training problems.
Figure 5: Example of 3D object instance projection encoding for a Sokoban and Tetris instance.
Figure 6: Example of 3D encoding alternative for Tetris which includes information about the object type.
In case of planning, the IPC domains we discuss in this work usually contain only units or tens of problems.
Therefore, we cannot use these domains directly to
train any network. Training on such a small data set
can lead to problems like overfitting or memorization
rather than learning.
There are planning problem domains with existing generators, such as the ones provided in the learning track of IPC 2008. However, generators do not exist for all of the mentioned IPC domains. The only solution is often creating one's own data generator for the problem domain.
5.1.1 Properties of Generated Data
If we want the neural network to learn the properties of a problem domain, we have to supply problem instances that exhibit all of them. Training on instances that are too simple can cause worse generalization abilities. On the other hand, training on "difficult" instances can cause the network to overlook the basic concepts of the domain. That leads us to the definition of a "hard problem instance".
We are well capable of constructing problems that are hard for most humans, but in terms of machine learning this metric can be completely different. There are ways of measuring the hardness of a Sokoban level, for example in (Bento et al., 2019), but they do not provide a general metric that can measure the hardness of the level with respect to a machine learning algorithm.
For techniques such as bootstrapping (Arfaee
et al., 2010), creating problems of increasing hardness
is crucial. Bootstrapping starts training with a set of
problems that should be ”solvable” for the neural net-
work. Then it shows the network different problems that should be harder, and if there is a problem the network cannot solve, it gets added to the training set.
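A sketch of this loop under the stated assumptions; train and solves are hypothetical callbacks standing in for the actual training procedure and a solvability check (e.g., a search within a time limit).

```python
def bootstrap(train, solves, simple_problems, harder_problems, rounds=10):
    """Grow the training set with harder problems the current network fails on."""
    training_set = list(simple_problems)
    network = train(training_set)
    for _ in range(rounds):
        unsolved = [p for p in harder_problems if not solves(network, p)]
        if not unsolved:
            break  # the network already solves every problem we can show it
        training_set.extend(unsolved)  # add the failures, as described above
        network = train(training_set)
    return network
```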
This learning technique is very beneficial because
the network starts with the ”simple” instances and
learns basic concepts and structures present in the
problem. Further on, it generalizes to more difficult instances while still using the base level knowledge and being able to solve the simple problems.
5.2 Training
Typically, when training a neural network we need
inputs (problem instances) but also labels which are
the desired outputs we want to see from the network.
If we want to train a network that outputs a heuristic value, its output will be a single number.
Usually, when we train a network to return a value
we want to minimize the difference between the ex-
pected output y and returned value x.
loss(x, y) = |x − y|
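This is the standard L1 loss; a minimal sketch in PyTorch (an assumed framework choice), with illustrative values:

```python
import torch
import torch.nn.functional as F

predicted = torch.tensor([4.0, 7.0, 2.0])  # heuristic values returned by the network
expected = torch.tensor([5.0, 6.0, 2.0])   # target values from the training data

loss = F.l1_loss(predicted, expected)      # mean of |x - y| over the batch
```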
However, in this case lowering the difference be-
tween the values does not tell us anything about the
quality of the learned heuristic. We do not care as
much for the exact value as we care for the process
behind its computation. One heuristic value also does
not bring as much information as a set of values for
different states in the state-space of one problem in-
stance.
Heuristic values put the states into perspective and we can compare which states would be expanded during the search. That leads us to the ideal situation, which would be using the search algorithm itself in the loss function.
If we had the optimal (or an existing) solution to the problem in the training data, we could just plug the neural network into the search as a heuristic and evaluate the difference between the expected solution and the newly found one. That would be the best metric to tell how well the neural network works as a heuristic function.
6 CONCLUSION
We proposed three novel planning problem representations suitable for training neural networks that can be created from a PDDL problem definition. We analyzed the IPC planning domains to filter out domains which are explicitly defined on a grid, which makes them great candidates for our grid-based representations. We also picked domains which are not defined on a grid and showed that it is still possible to use a grid representation even when no grid is defined in their PDDL.
We showed examples of all proposed representa-
tions and described their advantages and drawbacks.
The biggest advantage of problems represented by grids is the grid structure, which can be easily used for neural network training. If we used such a representation on outputs from the available PDDL data generators, it would mean more reliable training data for architectures that require a grid-based input.
We also discussed the domain-independence of the representations, as well as possible automation in terms of creating PDDL parsers that would be able to create grid representations solely from the PDDL files.
Finally, we briefly discussed training neural networks for planning. We pointed out problems like obtaining high quality data, properties that the data should have, the hardness of the problems and choosing a correct loss function for training.
The main drawback of these representations is the lack of action information. In order to correctly compute a heuristic or solve a planning problem, it is necessary
to understand the transition system of the problem and
how it is created. In this case, we only aim to repre-
sent the problem in a convenient manner and assume
that further information is provided as another input
to the architecture.
In the future, we would like to focus on an N-dimensional representation that would be able to add the definition of actions into the grid representation as well. Our next focus is on the PDDL to grid parsers, which could be a beneficial set of tools for trying out different machine learning techniques, such as the mentioned bootstrapping.
ACKNOWLEDGEMENTS
The work of Michaela Urbanovská was supported by the OP VVV funded project CZ.02.1.01/0.0/0.0/16019/0000765 "Research Center for Informatics" and the work of Antonín Komenda was supported by the Czech Science Foundation (grant no. 21-33041J).
REFERENCES
Aeronautiques, C., Howe, A., Knoblock, C., McDermott, I. D., Ram, A., Veloso, M., Weld, D., SRI, D. W., Barrett, A., Christianson, D., et al. (1998). PDDL - The Planning Domain Definition Language. Technical report.
Arfaee, S. J., Zilles, S., and Holte, R. C. (2010). Boot-
strap learning of heuristic functions. In Felner, A. and
Sturtevant, N. R., editors, Proceedings of the Third
Annual Symposium on Combinatorial Search, SOCS
2010, Stone Mountain, Atlanta, Georgia, USA, July
8-10, 2010. AAAI Press.
Asai, M. and Fukunaga, A. (2017). Classical planning in
deep latent space: From unlabeled images to PDDL
(and back). In Besold, T. R., d’Avila Garcez, A. S.,
and Noble, I., editors, Proceedings of the Twelfth In-
ternational Workshop on Neural-Symbolic Learning
and Reasoning, NeSy 2017, London, UK, July 17-18,
2017, volume 2003 of CEUR Workshop Proceedings.
CEUR-WS.org.
Asai, M. and Fukunaga, A. (2018). Classical planning in
deep latent space: Bridging the subsymbolic-symbolic
boundary. In Thirty-Second AAAI Conference on Ar-
tificial Intelligence.
Bento, D. S., Pereira, A. G., and Lelis, L. H. S. (2019).
Procedural generation of initial states of sokoban. In
Kraus, S., editor, Proceedings of the Twenty-Eighth
International Joint Conference on Artificial Intelli-
gence, IJCAI 2019, Macao, China, August 10-16,
2019, pages 4651–4657. ijcai.org.
Francès, G., Corrêa, A. B., Geissmann, C., and Pommerening, F. (2019). Generalized potential heuristics for
ing, F. (2019). Generalized potential heuristics for
classical planning. In Proceedings of the Twenty-
Eighth International Joint Conference on Artificial In-
telligence, IJCAI-19, pages 5554–5561. International
Joint Conferences on Artificial Intelligence Organiza-
tion.
Geffner, H. (2018). Model-free, model-based, and general
intelligence. CoRR, abs/1806.02308.
Groshev, E., Tamar, A., Goldstein, M., Srivastava, S., and
Abbeel, P. (2018). Learning generalized reactive poli-
cies using deep neural networks. In 2018 AAAI Spring
Symposium Series.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard,
R. E., Hubbard, W., and Jackel, L. D. (1989). Back-
propagation applied to handwritten zip code recogni-
tion. Neural computation, 1(4):541–551.
McDermott, D. V. (2000). The 1998 AI planning systems
competition. AI Mag., 21(2):35–55.
Russell, S. J. and Norvig, P. (2020). Artificial Intelligence:
A Modern Approach (4th Edition). Pearson.
Shen, W., Trevizan, F. W., and Thiébaux, S. (2020). Learn-
ing domain-independent planning heuristics with hy-
pergraph networks. In Beck, J. C., Buffet, O., Hoff-
mann, J., Karpas, E., and Sohrabi, S., editors, Pro-
ceedings of the Thirtieth International Conference on
Automated Planning and Scheduling, Nancy, France,
October 26-30, 2020, pages 574–584. AAAI Press.
Toyer, S., Thiébaux, S., Trevizan, F. W., and Xie, L. (2020).
Asnets: Deep learning for generalised planning. J.
Artif. Intell. Res., 68:1–68.
Urbanovská, M. and Komenda, A. (2021). Neural net-
works for model-free and scale-free automated plan-
ning. Knowledge and Information Systems, pages 1–
36.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I.
(2017). Attention is all you need. In Advances in
neural information processing systems, pages 5998–
6008.