在本章中,我们将深入了解并讨论图形数据库的实现,展示它们与其他存储和查询复杂、结构多变、连接紧密的数据的方式有何不同。虽然没有一个统一的体系结构模式存在,甚至在图形数据库中也不存在,但这一章是正确的
描述您期望在图形数据库中找到的最常见的体系结构模式和组件。
我们将在本章中使用Neo4j graph数据库进行讨论,原因有几个。Neo4j是一个具有本机处理能力和本机图形存储的图形数据库(参见第1章对本机图形处理和存储的讨论)。
除了在写作时是最常用的图形数据库,它还有开源的透明性优势,让读者更容易深入到一个层次并检查代码。最后,这是一个作者很熟悉的数据库。
In this chapter, we peek under the hood and discuss the implementation of graph databases, showing how they differ from other means of storing and querying com‐plex, variably-structured, densely connected data. Although it is true that no single universal architecture pattern exists, even among graph databases, this chapter describes the most common architecture patterns and components you can expect to find inside a graph database. We illustrate the discussion in this chapter using the Neo4j graph database, for several reasons. Neo4j is a graph database with native processing capabilities as well as native graph storage (see Chapter 1 for a discussion of native graph processing and storage). In addition to being the most common graph database in use at the time of writing, it has the transparency advantage of being open source, making it easy for the adven‐turesome reader to go a level deeper and inspect the code. Finally, it is a database theauthors know well.
在本书中,我们已经多次讨论了属性图模型。到目前为止,您应该已经熟悉了它的节点概念,即通过命名和定向关系连接的节点,节点和关系都可以作为属性的容器。尽管模型本身在图数据库实现中是相当一致的,但是有许多方法可以在数据库引擎的主内存中编码和表示图。在许多不同的引擎架构中,我们说图形数据库具有本地处理能力,如果它显示了一个被称为的属性索引自由邻接.
We’ve discussed the property graph model several times throughout this book. By now you should be familiar with its notion of nodes connected by way of named and directed relationships, with both the nodes and relationships serving as containers for properties. Although the model itself is reasonably consistent across graph database implementations, there are numerous ways to encode and represent the graph in the database engine’s main memory. Of the many different engine architectures, we say that a graph database has native processing capabilities if it exhibits a property called index-free adjacency
一个利用无索引邻接的数据库引擎是一个每个节点都直接引用其邻接节点的引擎。因此,每个节点充当其他邻近节点的微索引,这比使用全局索引便宜得多。这意味着查询时间与图的总大小无关,而只是与搜索的图的数量成比例。
相比之下,非原生的图形数据库引擎使用(全局)索引将节点连接在一起,如图6-1所示。这些索引在每条线上增加了一层间接关系,从而产生了更大的计算成本。原生图处理的支持者认为,无索引邻接对于快速、高效的图遍历至关重要。
A database engine that utilizes index-free adjacency is one in which each node main‐tains direct references to its adjacent nodes. Each node, therefore, acts as a microindex of other nearby nodes, which is much cheaper than using global indexes. It means that query times are independent of the total size of the graph, and are instead simply proportional to the amount of the graph searched. A nonnative graph database engine, in contrast, uses (global) indexes to link nodes together, as shown in Figure 6-1. These indexes add a layer of indirection to each tra‐versal, thereby incurring greater computational cost. Proponents for native graph processing argue that index-free adjacency is crucial for fast, efficient graph traver‐sals.
展示图形处理的非本机方法如何工作。找到阿里量我们首先要执行一个索引查找,代价是O(log n).这对于偶然的或浅层的查找来说是可以接受的,但是当我们反转遍历的方向时,代价很快就会变得昂贵起来。如果我们想要找出谁是Alice的朋友,而不是找到Alice的朋友,我们将不得不执行多个操作索引查找,针对每个可能是Alice朋友的节点。这使得成本更加繁重。而找出Alice的朋友是谁的代价是O(log n),而找出Alice的朋友是谁的代价是O(m log n)。
shows how a nonnative approach to graph processing works. To find Ali‐ ce’s friends we have first to perform an index lookup, at cost O(log n). This may be acceptable for occasional or shallow lookups, but it quickly becomes expensive when we reverse the direction of the traversal. If, instead of finding Alice’s friends, we wanted to find out who is friends with Alice, we would have to perform multiple index lookups, one for each node that is potentially friends with Alice. This makes the cost far more onerous. Whereas it’s O(log n) cost to find out who are Alice’s friends,it’s O(m log n) to find out who is friends with Alice.
使用无索引邻接,双向连接可以有效地预先计算并作为关系存储在数据库中。相反,当使用索引来模拟记录之间的连接时,数据库中并不存在实际的关系。由此产生了两个问题:
首先,在算法上,使用全局索引查找通常比遍历物理关系昂贵得多。索引通常在时间上花费O(log(n)),而——至少在neo4j中——遍历一个关系需要O(1)时间。在理论上,
即使是适中的n值,对数成本也可能比常量时间贵很多倍。在实践中,由于图及其全局索引争夺缓存和I/O等资源(例如,当索引和图数据之间发生页面争用时),性能可能会更差。
其次,当我们试图从建立索引的方向“相反”遍历时,使用索引来模拟连接就会出现问题。
现在,我们面临的选择是为每个遍历场景创建反向查找索引,还是必须对原始索引执行蛮力搜索,这是一个O(n)操作。在这种情况下,由于算法性能很差,这样的连接代价太大,对于在线系统来说没有任何实际用途。
With index-free adjacency, bidirectional joins are effectively precomputed and stored in the database as relationships. In contrast, when using indexes to simulate connec‐tions between records, there is no actual relationship stored in the database. From this, two problems arise: Firstly, using a global index lookup is typically far more expensive algorithmically than traversing a physical relationship. Indexes typically cost O(log(n)) in time, whereas—at least in Neo4j—traversing a relationship is O(1) in time. In theory, for even modest values of n, the logarithmic costs can be many times more expensive than constant time. In practice, the performance can be even worse, as a result of the graph and its global indexes contending for resources like caches and I/O (e.g., when page contention occurs between index and graph data). Secondly, using indexes to simulate connections becomes problematic when we try to traverse in the “opposite” direction from the one for which the index was constructed. Now we are faced with the choice of creating a reverse-lookup index for each traversal scenario, or we have to perform a brute-force search through the original index,which is an O(n) operation. Given the poor algorithmic performance in this situation,joins like this are simply too costly to be of any practical use for online systems.
索引查询可以用于小型网络,如图6-1所示的网络,但对于在较大的图上查询来说,代价太高了。与在查询时使用索引查找来完成关系的角色不同,具有本机图形处理功能的图形数据库使用无索引邻接来确保高性能遍历。
图6-2显示了关系如何消除索引查找的需要。
Index lookups can work for small networks, such as the one in Figure 6-1, but are far too costly for queries over larger graphs. Instead of using index lookups to fulfill the role of relationships at query time, graph databases with native graph processing capabilities use index-free adjacency to ensure high-performance traversals. Figure 6-2 shows how relationships eliminate the need for index lookups.
回想一下,在通用图数据库中,可以非常便宜地从任意方向(从头到尾或从头到尾)遍历关系。如图6-2所示,为了使用图表找到爱丽丝的朋友,我们只需以O(1)的成本跟踪她的外向朋友关系。要找到谁是Alice的朋友,我们只需跟踪Alice的所有新朋友关系到它们的来源,同样以O(1)为代价。
考虑到这些代价,很明显,至少在理论上,图遍历是非常有效的。但是,只有在为此目的而设计的架构支持下,这样的高性能遍历才能成为现实。
Recall that in a general-purpose graph database, relationships can be traversed in either direction (tail to head, or head to tail) extremely cheaply. As we see in Figure 6-2, to find Alice’s friends using a graph, we simply follow her outgoing FRIEND relationships, at O(1) cost each. To find who is friends with Alice, we simply follow all of Alice’s incoming FRIEND relationships to their source, again at O(1) cost each. Given these costs, it’s clear that, in theory at least, graph traversals can be very effi‐cient. But such high-performance traversals only become reality when they are sup‐ported by an architecture designed for that purpose.
如果无索引邻接是高性能遍历、查询和写入的关键,那么图数据库设计的一个关键方面就是图的存储方式。一种高效的本机图存储格式支持对任意图算法的极其快速的遍历——这是使用图的一个重要原因。为了便于说明,我们将以Neo4j数据库为例,说明图形数据库是如何被archi‐protectes的。
首先,让我们通过查看Neo4j的高级架构(如图6-3所示)来将我们的讨论置于上下文环境中。在接下来的内容中,我们将自底向上工作,从磁盘上的文件,通过编程api,直到Cypher查询语言。在此过程中,我们将讨论Neo4j的性能和可靠性特征,以及使Neo4j成为高性能、可靠的图形数据库的设计决策。
If index-free adjacency is the key to high-performance traversals, queries, and writes,then one key aspect of the design of a graph database is the way in which graphs are stored. An efficient, native graph storage format supports extremely rapid traversals for arbitrary graph algorithms—an important reason for using graphs. For illustrative purposes we’ll use the Neo4j database as an example of how a graph database is archi‐tected. First, let’s contextualize our discussion by looking at Neo4j’s high-level architecture,presented in Figure 6-3. In what follows we’ll work bottom-up, from the files on disk,through the programmatic APIs, and up to the Cypher query language. Along the way we’ll discuss the performance and dependability characteristics of Neo4j, and the design decisions that make Neo4j a performant, reliable graph database.
Neo4j将图数据存储在许多不同的存储文件中。每个存储文件都包含了图形特定部分的数据(例如,节点、关系、标签和属性都有单独的存储)。存储职责的划分——尤其是图结构与属性数据的分离——促进了性能图遍历,即使这意味着用户对其图的视图和磁盘上的实际记录在结构上是不同的。让我们从图6-4所示的节点结构和磁盘上的关系开始探索物理存储。
Neo4j stores graph data in a number of different store files. Each store file contains the data for a specific part of the graph (e.g., there are separate stores for nodes, relation‐ships, labels, and properties). The division of storage responsibilities—particularly the separation of graph structure from property data—facilitates performant graph traversals, even though it means the user’s view of their graph and the actual records on disk are structurally dissimilar. Let’s start our exploration of physical storage by looking at the structure of nodes and relationships on disk as shown in Figure 6-4
节点存储文件用于存储节点记录。在用户级图中创建的每个节点最终都在节点存储中,其物理文件是neostore.nodestore.db。与大多数Neo4j存储文件一样,节点存储是一个固定大小的记录存储,其中每个记录的长度为9个字节。固定大小的记录可以在存储文件中快速查找节点。如果我们有一个id为100的节点,那么我们知道它的记录从文件的900字节开始。基于这种格式,数据库可以直接计算记录的位置,成本为O(1),而不是执行搜索,后者的成本为O(log n)。
节点记录的第一个字节是正在使用的标志。这将告诉数据库该记录当前是否用于存储一个节点,或者是否可以代表一个新节点回收该记录(Neo4j的.id文件跟踪未使用的记录)。接下来的四个字节表示连接到节点的第一个关系的ID,后面的四个字节表示节点的第一个属性的ID。标签的5个字节指向这个节点的标签存储(标签可以内联到相对较少的地方)。最后一个字节是为标记保留的。其中一个这样的标志用于识别密集连接的节点,其余的空间保留给未来使用。
node记录是非常轻量级的:它实际上只是一些指向关系、标签和属性列表的指针。
The node store file stores node records. Every node created in the user-level graph ends up in the node store, the physical file for which is neostore.nodestore.db. Like most of the Neo4j store files, the node store is a fixed-size record store, where each record is nine bytes in length. Fixed-size records enable fast lookups for nodes in the store file. If we have a node with id 100, then we know its record begins 900 bytes into the file. Based on this format, the database can directly compute a record’s location, at cost O(1), rather than performing a search, which would be cost O(log n). The first byte of a node record is the in-use flag. This tells the database whether the record is currently being used to store a node, or whether it can be reclaimed on behalf of a new node (Neo4j’s .id files keep track of unused records). The next four bytes represent the ID of the first relationship connected to the node, and the follow‐ing four bytes represent the ID of the first property for the node. The five bytes for labels point to the label store for this node (labels can be inlined where there are rela‐tively few of them). The final byte extra is reserved for flags. One such flag is used to identify densely connected nodes, and the rest of the space is reserved for future use. The node record is quite lightweight: it’s really just a handful of pointers to lists of relationships, labels, and properties.
相应地,关系存储在关系存储文件neo store.relationshipstore.db中。与节点存储一样,关系存储也包含固定大小的记录。每个关系记录包含节点的id的开始和结束的关系,关系类型的指针存储在关系类型(商店),为下一个和以前的关系记录指针的开始和结束节点,和一个标志指示当前记录是否第一的通常被称为链的关系。
Correspondingly, relationships are stored in the relationship store file, neo store.relationshipstore.db. Like the node store, the relationship store also consists of fixed-sized records. Each relationship record contains the IDs of the nodes at the start and end of the relationship, a pointer to the relationship type (which is stored in the relationship type store), pointers for the next and previous relationship records for each of the start and end nodes, and a flag indicating whether the current record is the first in what’s often called the relationship chain.
在图6-5中,我们可以看到不同的存储文件是如何在磁盘上交互的。两个节点记录中的每一个都包含一个指向该节点的第一个属性和关系链中的第一个关系的指针。要读取一个节点的属性,我们从指向第一个属性的指针开始跟随单链表结构。为了找到一个节点的关系,我们跟随该节点的关系指针指向它的第一个关系(这个例子中的LIKES关系‐ship)。从这里开始,我们跟踪这个特定节点的双链表(即开始节点双链表或结束节点双链表),直到找到我们感兴趣的关系。拥有发现的记录关系我们想要的,我们可以读到关系的适当的关系(如果有)使用相同的单链表结构用于节点属性,或者我们可以检查两个节点的节点记录的关系连接使用开始节点和结束节点id。这些id乘以节点记录大小,就得到了节点存储文件中每个节点的直接偏移量。
In Figure 6-5, we see how the various store files interact on disk. Each of the two node records contains a pointer to that node’s first property and first relationship in a relationship chain. To read a node’s properties, we follow the singly linked list struc‐ture beginning with the pointer to the first property. To find a relationship for a node,we follow that node’s relationship pointer to its first relationship (the LIKES relation‐ship in this example). From here, we then follow the doubly linked list of relation‐ships for that particular node (that is, either the start node doubly linked list, or the end node doubly linked list) until we find the relationship we’re interested in. Having found the record for the relationship we want, we can read that relationship’s proper‐ties (if there are any) using the same singly linked list structure as is used for node properties, or we can examine the node records for the two nodes the relationship connects using its start node and end node IDs. These IDs, multiplied by the node record size, give the immediate offset of each node in the node store file.
使用固定大小的记录和类似指针的记录id,遍历只需在数据结构周围追逐指针即可实现,这可以以非常高的速度执行。为了遍历一个节点到另一个节点的特定关系,data - base执行了几次廉价的ID计算(这些计算比搜索全局索引要便宜得多,就像我们在非graph原生数据库中伪造一个图那样):
1. 在给定的节点记录中,通过计算其在关系存储中的偏移量(即将其ID乘以固定的关系记录大小)来定位关系链中的第一个记录。这将使我们直接找到关系存储中的正确记录。
2. 从关系记录中,查找第二个节点字段,以找到第二个节点的ID。将该ID乘以节点记录大小,以在存储中找到正确的节点记录。
如果我们希望将遍历限制到具有特定类型的关系,我们将在关系类型存储中添加一个查找。同样,这是ID与记录大小的简单相乘,以在关系存储中找到适当的关系类型记录的偏移量。类似地,如果我们选择通过标签约束,我们引用标签存储。
除了包含图形结构的节点和关系存储之外,我们还有属性存储文件,它们以键-值对的形式持久化用户数据。回想一下,作为一个属性图数据库,Neo4j允许属性(名称-值对)附加到节点和关系上。因此,属性存储是由节点和关系记录引用的。
属性商店中的记录物理地存储在neostore中。财产tore.db文件。与节点和关系存储一样,属性记录的大小是固定的。每个属性记录由四个属性块和属性链中下一个属性的ID组成(记住,与关系链中使用的双链列表相比,属性是作为磁盘上的单链列表保存的)。每个属性占用1到4个属性块——因此,一个属性记录最多可以保存4个属性。一个属性记录保存属性类型(Neo4j允许任何原始JVM类型、字符串和JVM原类型的数组),以及一个指向属性indexfile(属性名存储在其中)的指针。对于每个属性的值,记录包含指向动态存储记录的指针或内联值。动态存储允许存储较大的属性值。有两个动态存储:一个动态字符串存储(neostore.propertystore.db.strings)和一个动态数组存储(neostore.propertystore.db.arrays)。
With fixed-sized records and pointer-like record IDs, traversals are implemented simply by chasing pointers around a data structure, which can be performed at very high speed. To traverse a particular relationship from one node to another, the data‐base performs several cheap ID computations (these computations are much cheaper than searching global indexes, as we’d have to do if faking a graph in a nongraph native database): 1. From a given node record, locate the first record in the relationship chain by computing its offset into the relationship store—that is, by multiplying its ID by the fixed relationship record size. This gets us directly to the right record in the relationship store. 2. From the relationship record, look in the second node field to find the ID of the second node. Multiply that ID by the node record size to locate the correct node record in the store. Should we wish to constrain the traversal to relationships with particular types, we’d add a lookup in the relationship type store. Again, this is a simple multiplication of ID by record size to find the offset for the appropriate relationship type record in the relationship store. Similarly if we choose to constrain by label, we reference the label store. In addition to the node and relationship stores, which contain the graph structure, we have property store files, which persist the user’s data in key-value pairs. Recall that Neo4j, being a property graph database, allows properties—name-value pairs—to be attached to both nodes and relationships. The property stores, therefore, are refer‐enced from both node and relationship records. Records in the property store are physically stored in the neostore.propertys tore.db file. As with the node and relationship stores, property records are of a fixed size. Each property record consists of four property blocks and the ID of the next property in the property chain (remember, properties are held as a singly linked list on disk as compared to the doubly linked list used in relationship chains). Each prop‐erty occupies between one and four property blocks—a property record can, there‐fore, hold a maximum of four properties. A property record holds the property type (Neo4j allows any primitive JVM type, plus strings,plus arrays of the JVM primitivetypes), and a pointer to the property indexfile (neostore.propertystore.db.index),which is where the property name is stored. For each property’s value, the record contains either a pointer into a dynamic store record or an inlined value. The dynamic stores allow for storing large property values. There are two dynamic stores:a dynamic string store (neostore.propertystore.db.strings) and a dynamic array store (neostore.propertystore.db.arrays).