Pattern Matching

SCL queries follow the same structure as Open Cypher queries. In their simplest form, they start with a MATCH clause that specifies a pattern to match, followed by a RETURN clause that returns the bound variables. In the following query, all nodes of the graph are matched and bound to variable n.

MATCH (n)
RETURN *

Statements in a query are conceptually executed one after the other (although the engine is free to execute them in any order as long as the correct query result is preserved). When the star ( * ) operator is used in a RETURN statement, all bound variables are returned.

Patterns are expressed in terms of one or more paths. A path is a succession of steps connecting nodes of the graph through edges. For instance, the following pattern consists of a single path with one step, which matches all pairs of nodes connected through a directed edge. In this case, the first node in the pair is bound to variable n, while the second node is bound to variable m:

MATCH (n)->(m)
RETURN *

More complex patterns can be expressed by combining multiple paths, as in the following query, which uses two paths of one step each to look for pairs of nodes that are reciprocally connected through directed edges:

MATCH (n)->(m), (m)->(n)
RETURN *

The previous query could alternatively be expressed as a single path with two steps:

MATCH (n)->(m)->(n)
RETURN *

or, alternatively, by reversing the direction of the matched edges:

MATCH (n)<-(m)<-(n)
RETURN *

Finally, if we also want to know the ids of the edges, we can bind them to variables and express the pattern in the following way:

MATCH (n)-[r]->(m)-[q]->(n)
RETURN *

Node homomorphism, Edge Isomorphism

In SCL, as in Open Cypher, node matching follows the rules of subgraph homomorphism, while edges follow the rules of subgraph isomorphism. In homomorphism, two variables can be bound to the same object, while in isomorphism two variables must be bound to different objects. For instance, in the following query:

MATCH (n)-[r]->(m)-[q]->(n)
RETURN *

n and m could be bound to the same node, but r and q must be bound to different edges. This means that if the graph contains a node with two self loops (edges that start and end at that same node), the pattern would find an occurrence. This would not be true if the node had just a single self loop, because then only one of the two edge variables (r or q) could be bound in the pattern.
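The self-loop case can be checked on a toy graph. The following Python sketch is not SCL and does not use the Sparksee API; it simply brute-forces the pattern (n)-[r]->(m)-[q]->(n) over a hypothetical list of (edge id, tail, head) tuples, letting n and m coincide while requiring r and q to differ.

```python
from itertools import product

def match_two_step_cycle(edges):
    """Enumerate bindings for (n)-[r]->(m)-[q]->(n).

    `edges` is a hypothetical list of (edge_id, tail, head) tuples.
    Nodes may repeat (homomorphism); edges may not (isomorphism).
    """
    matches = []
    for (r, n, m), (q, m2, n2) in product(edges, repeat=2):
        if r == q:
            continue  # edge isomorphism: r and q must be different edges
        if m2 == m and n2 == n:
            # the second step leaves m and comes back to n
            matches.append({"n": n, "r": r, "m": m, "q": q})
    return matches

# Node "a" with two self loops: matches exist, with n = m = "a".
two_loops = [("e1", "a", "a"), ("e2", "a", "a")]
# Node "a" with a single self loop: r and q cannot both be bound.
one_loop = [("e1", "a", "a")]

print(len(match_two_step_cycle(two_loops)))  # 2 (r=e1,q=e2 and r=e2,q=e1)
print(len(match_two_step_cycle(one_loop)))   # 0
```

Dropping the `r == q` check in this sketch would make the single self loop match as well, which is exactly what the edge isomorphism rule rules out.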

Matching Nodes

The following query matches and returns all nodes of the graph:

MATCH (n)
RETURN *

Nodes can be filtered by their type when matched. The following query returns all the nodes of the graph of type “person”:

MATCH (n:person)
RETURN *

Finally, we can also match nodes with certain property values. For instance, the following query returns all nodes of type person whose name is “John” and whose age is 18:

MATCH (n:person { name : "John", age : 18} )
RETURN *

As described in ‘Differences with Open Cypher’, the properties matched depend on whether the type of the node (“person” in this case) is specified or not. In the query above, if there exists both a “type-specific” “name” property for “person” and a global property called “name”, the “type-specific” one takes preference, given that we know that the type of the node is “person”. However, in the following query, the global property takes preference because we have not specified the type of n:

MATCH (n { name : "John"} )
RETURN *

Matching Edges

Similar to nodes, we can match edges. The following query matches and returns all the edges of the graph, regardless of their direction. This means that each edge will be matched twice, once for each orientation.

MATCH ()-[r]-()
RETURN *
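The double matching of the undirected pattern above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the engine's implementation: edges are hypothetical (edge id, tail, head) tuples, and the toy graph has no self loops.

```python
def match_undirected(edges):
    """Enumerate bindings for ()-[r]-() over (edge_id, tail, head) tuples.

    Each row is (first endpoint, edge id, second endpoint).
    """
    rows = []
    for edge_id, tail, head in edges:
        rows.append((tail, edge_id, head))  # edge read from tail to head
        rows.append((head, edge_id, tail))  # same edge read from head to tail
    return rows

edges = [("e1", "a", "b"), ("e2", "b", "c")]
rows = match_undirected(edges)
print(len(rows))  # 4: each of the 2 edges appears once per orientation
```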

The matching of edges can also be restricted by the edge type. For instance, the following query matches all the edges of type “role”:

MATCH ()-[r:role]-()
RETURN *

Finally, edges can also have properties, like nodes, and these properties can be used to restrict the matching of the pattern:

MATCH ()-[r:role {type : 'actor'}]-()
RETURN *

Edge directions

When matching edges, we can specify the direction of the match. This is particularly useful when the graph has directed edges and we are interested in particular endpoints. For instance, the following query retrieves all the nodes that are the “tail” of an edge of type “role”:

MATCH (n)-[r:role]->()
RETURN *

Similarly, we could be interested in the nodes that are the “head” of an edge of type “role”. Either of the following two queries expresses this pattern:

MATCH (n)<-[r:role]-()
RETURN *

MATCH ()-[r:role]->(n)
RETURN *
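As a rough illustration of the “tail” and “head” roles used above, the following Python sketch selects the endpoint that each directed pattern binds to n. The (edge id, type, tail, head) tuple layout and the sample data are assumptions made for this example only.

```python
def tails(edges, edge_type):
    """Bindings of n for (n)-[r:edge_type]->(): the node each edge leaves."""
    return [tail for _, etype, tail, _head in edges if etype == edge_type]

def heads(edges, edge_type):
    """Bindings of n for (n)<-[r:edge_type]-(): the node each edge enters."""
    return [head for _, etype, _tail, head in edges if etype == edge_type]

# Hypothetical edges: (edge_id, type, tail, head)
edges = [
    ("e1", "role", "john", "the_thing"),
    ("e2", "role", "mary", "the_thing"),
    ("e3", "likes", "john", "mary"),
]

print(tails(edges, "role"))  # ['john', 'mary']
print(heads(edges, "role"))  # ['the_thing', 'the_thing']
```

A node appears once per matching edge, so "the_thing" is listed twice as a head in this sketch.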

Matching Nodes and Edges

The matching of nodes and edges can be combined in a single query. For instance, the following query matches all pairs of “persons” and “movies” such that the person has acted in the movie:

MATCH (n:person)-[r:role {type : 'actor'}]-(m:movie)
RETURN *

Similarly, we can restrict the pattern further to look for all the actors of a specific movie:

MATCH (n:person)-[r:role {type : 'actor'}]-(m:movie {title : 'The Thing'})
RETURN *

We can also match a node or an edge without binding it to a variable. For instance, if in the previous query we are only interested in the person nodes, we can express this by removing the other variable names as follows:

MATCH (n:person)-[:role {type : 'actor'}]-(:movie {title : 'The Thing'})
RETURN *

Complex patterns with multiple paths and MATCH clauses

We can build more complex patterns by combining multiple paths. When using multiple paths, occurrences of the same variable in more than one path refer to the same matched object. For instance, the following query looks for pairs of movies in which the same person has participated.

MATCH (n:person)-[r1:role]-(m:movie),
      (n:person)-[r2:role]-(q:movie)
RETURN *

Note that due to the edge isomorphism rule, r1 and r2 must be different edges.

If we want to allow r1 and r2 to match the same edge, we can express the query with two MATCH clauses as follows:

MATCH (n:person)-[r1:role]-(m:movie)
MATCH (n:person)-[r2:role]-(q:movie)
RETURN *

In this case, two separate patterns are matched in the graph and their results are combined. In other words, the edge isomorphism rule only applies to multiple paths within the same pattern, not across different patterns.
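The difference between one pattern with two paths and two separate patterns can be sketched in Python. This is a toy model, not Sparksee code: the edges are hypothetical (edge id, person, movie) tuples standing for “role” edges, and the `distinct_edges` flag is an illustrative switch for whether the edge isomorphism rule applies.

```python
def movie_pairs(role_edges, distinct_edges):
    """Combine (n)-[r1]-(m) and (n)-[r2]-(q) on the shared variable n.

    distinct_edges=True models two paths in one pattern (r1 != r2).
    distinct_edges=False models two separate MATCH clauses.
    """
    rows = []
    for r1, n1, m in role_edges:
        for r2, n2, q in role_edges:
            if n1 != n2:
                continue  # n must be the same person in both paths
            if distinct_edges and r1 == r2:
                continue  # edge isomorphism within a single pattern
            rows.append((n1, r1, m, r2, q))
    return rows

# Hypothetical data: alice has two roles, bob has one.
role_edges = [
    ("e1", "alice", "m1"),
    ("e2", "alice", "m2"),
    ("e3", "bob", "m3"),
]

print(len(movie_pairs(role_edges, distinct_edges=True)))   # 2
print(len(movie_pairs(role_edges, distinct_edges=False)))  # 5
```

With the rule relaxed, every role edge also pairs with itself, so bob, who has a single role, now contributes the row (bob, e3, m3, e3, m3).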

Returning columns with *

When returning columns with the * operator, the following aspects must be considered:

* All variables bound in the query are returned.
* The order of the columns is the same as the order of occurrence of the variables in the query, reading from left to right first, and then from top to bottom.
* Variables occurring more than once are returned only once, at the position of their first occurrence.

For instance, in the following query:

MATCH (n:person)-[r1:role {type : 'actor'}]-(m:movie)
MATCH (n:person)-[r2:role {type : 'director'}]-(q:movie)
RETURN *

the returned columns will be: n, r1, m, r2, q, in that order. Columns are always named after the expression name, unless an alias is specified. For more details, refer to the ‘RETURN’ clause.
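The ordering rule can be reproduced with a short Python sketch. It is only an approximation for illustration: the hypothetical `star_columns` helper uses a regular expression to pick up the identifier that directly follows an opening "(" or "[", which is enough for the queries on this page but is not a real SCL parser (for example, it would be confused by parentheses inside string literals).

```python
import re

# An identifier directly after "(" or "[" is treated as a variable name.
_VARIABLE = re.compile(r"[(\[]\s*([A-Za-z_]\w*)")

def star_columns(query):
    """Return variable names in order of first occurrence, without repeats."""
    columns = []
    for name in _VARIABLE.findall(query):
        if name not in columns:
            columns.append(name)
    return columns

query = """
MATCH (n:person)-[r1:role {type : 'actor'}]-(m:movie)
MATCH (n:person)-[r2:role {type : 'director'}]-(q:movie)
RETURN *
"""

print(star_columns(query))  # ['n', 'r1', 'm', 'r2', 'q']
```

Anonymous elements such as () or [:role] contribute no column, because no identifier follows the opening bracket.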

Sparksee openCypher

by Sparsity Technologies