Improving database efficiency

Improving the efficiency of database activity for Arity/Prolog32 applications may become important as your databases grow, particularly if they are larger than 100 MB. This section describes three sets of predicates that can be used to this end. First, world predicates are used to create separate name spaces. These can aid indexing, particularly by separating data access from interpreted code access. Worlds can also be used to create name space partitions in which an entire set of terms or predicates stored in the database is deactivated as another set becomes active.

Second, physical partitions may be created which are dynamically sized but have the effect of enhancing locality of reference. Locality of reference is important when a database must be paged to disk because of its size. Terms that are frequently accessed near one another in time benefit if they are stored on the same page or small set of pages.

Third, if you are building your own index structures, such as an alternative implementation of B-trees, you can benefit from storing terms that do not span more than one database page. Predicates are provided to determine the page size and the size of a term when it is stored in the database.

All three of these features are independent and may be freely intermixed. All are used within a single database. The ability to use up to four databases ("workspaces") is described in the Saving and restoring databases and using workspaces section.

Database code and data worlds

A database may be divided into separate name spaces, each of which is called a world. At least one world, the 'main' world, always exists. When you create a database or add terms to a database without specifying a world, the terms are automatically stored as part of the 'main' world. In practice, you never have to use worlds or the world management predicates if you do not wish to.

Worlds are convenient when you want to logically divide program and data name spaces. Furthermore, worlds give you the ability to "localize" predicates in that a predicate defined in one world will be treated as unique from a predicate having the same name and arity but defined in another world.

Any world can be used as a code world, as a data world, as both, or, if not needed at some time, as neither. The current code world is the world in which the interpreter searches for clauses. The clause predicates, such as asserta/1, assertz/1, abolish/1, call/1, and retract/1, always refer to the current code world. The current data world is the world in which data manipulation can occur. The data manipulation predicates for database keys, B-trees, and hash tables always refer to the current data world.
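
As a minimal sketch of localizing a predicate, the following defines greeting/1 separately in two worlds. The world names english and french and the predicates define_greetings/0 and greeting_in/2 are hypothetical, chosen only for illustration. The sketch assumes that neither world exists yet and that a clause which is already running continues to run after the code world changes.

define_greetings :-
    create_world(english),
    create_world(french),
    code_world(Old, english),      % clause predicates now refer to english
    assertz(greeting(hello)),
    code_world(_, french),         % clause predicates now refer to french
    assertz(greeting(bonjour)),
    code_world(_, Old).            % return to the original code world

% Look up greeting/1 in World, then return to the original code world.
greeting_in(World, G) :-
    code_world(Old, World),
    call(greeting(G)),
    code_world(_, Old).

Because the interpreter searches only the current code world, a user-defined predicate that exists only in another world would presumably not be found while english or french is current, so the work done between the two code_world/2 calls is limited here to built-in predicates and clauses of the selected world.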

There can be only one current code world and one current data world at any one time. You switch from world to world by using either the code_world/2 or data_world/2 predicates.

A single world can be both the current code world and the current data world. If you only have one world, the 'main' world, then the code, data, and default worlds are one and the same. Through the use of worlds, you can break up a large database by creating separate worlds for groups of terms which logically and functionally go together.

Suppose, for example, that you have two sets of test data for an application that you are developing. You might wish to create two data worlds, one for each test set, and switch between them as you continue to develop and test.
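
The test data scenario might be sketched as follows. The world names test_a and test_b, the key sample, the reading/2 terms, and the predicates setup_test_worlds/0 and use_test_set/1 are all hypothetical. The sketch also assumes that recordz/3 stores a term under a key in the current data world.

setup_test_worlds :-
    create_world(test_a),
    create_world(test_b),
    data_world(Old, test_a),               % store the first test set
    recordz(sample, reading(1, 10), _),
    data_world(_, test_b),                 % store the second test set
    recordz(sample, reading(1, 99), _),
    data_world(_, Old).                    % return to the original data world

% Make the chosen test set the current data world.
use_test_set(World) :-
    data_world(_, World).

After a call such as use_test_set(test_b), a goal like recorded(sample, Term, _) should see only the terms stored in test_b, since the database predicates search only the current data world.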

All worlds are treated as separate and equal entities by Arity/Prolog32. The clause manipulation predicates, such as assertz/1 and retract/1, search only the current code world for a term. If the term is not found in the current code world, the goal fails. The database predicates, such as recorded/3 and key/2, search only the current data world for a term. If the term is not found in the current data world, the goal fails.

If you want to search for a term among a number of code or data worlds, you can write a predicate that takes a list of code or data worlds, makes each one the current code or data world in turn, and searches it for the term. For example, the following predicate searches each data world in the list until the term is found. If the list is exhausted without a match, the predicate fails.

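% Try the first world in the list: make it current and look for the term.
data_search([World|_],Key,Term) :-
    data_world(_,World),
    recorded(Key,Term,_).
% Otherwise, continue with the remaining worlds.
data_search([_|Tail],Key,Term) :-
    data_search(Tail,Key,Term).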

create_world(+Name)

The create_world predicate creates a new world with the Name specified.

For example, the call

?- create_world(zargon).
yes

will create the world named zargon.

Note: You cannot add terms to or delete terms from the newly created world until you change to that world using the data_world/2 or code_world/2 predicate.

code_world(-Old,+New)

The code_world/2 predicate unifies the name of the present code world with Old and then changes the code world to New. If the code_world/2 predicate is encountered during backtracking, then the current code world is changed back to Old.

data_world(-Old,+New)

The data_world/2 predicate unifies the name of the present data world with Old and then changes the data world to New. If the data_world/2 predicate is encountered during backtracking, then the current data world is changed back to Old.
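
The backtracking behavior makes a temporary switch straightforward. The following sketch uses a hypothetical predicate, lookup_in_world/3, to look up a term in another data world and then return to the original one.

lookup_in_world(World, Key, Term) :-
    data_world(Old, World),       % remember the present world, switch to World
    recorded(Key, Term, _),
    data_world(_, Old).           % found: switch back explicitly

If recorded/3 fails, execution backtracks into the first data_world/2 call, which, according to the description above, changes the data world back to Old before lookup_in_world/3 itself fails. The same pattern should apply to code_world/2.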

what_worlds(-Name)

The what_worlds/1 predicate returns, through backtracking, the names of the worlds that currently exist.
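
For instance, the names returned by what_worlds/1 can be collected into a list and passed to the data_search/3 predicate shown earlier in this section. This is only a sketch: search_all_worlds/2 is a hypothetical name, and findall/3 is assumed to be available in its usual form.

search_all_worlds(Key, Term) :-
    findall(World, what_worlds(World), Worlds),   % every existing world
    data_search(Worlds, Key, Term).

Note that on success the current data world remains the world in which the term was found.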

delete_world(+Name)

The delete_world/1 predicate deletes a world and all of its contents.

Caution: Deleting a world deletes all of the terms it contains. You cannot delete the main world. If you delete the current code or data world (if it is other than main), then main becomes the current world. Deleting the current world is poor programming practice and should be avoided.
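
One way to follow this advice is to make main the current code and data world before deleting. The predicate safe_delete_world/1 below is a hypothetical helper, offered only as a sketch.

safe_delete_world(World) :-
    World \== main,               % the main world cannot be deleted
    code_world(_, main),          % make sure World is not the current code world
    data_world(_, main),          % make sure World is not the current data world
    delete_world(World).

This sketch leaves main as the current code and data world after the deletion. A caller that needs a different world must switch to it afterward.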

Partitions for locality of reference

A database partition is a set of page tables set aside for storing data that is physically separated from other data. In other words, partitions are used to enforce a kind of locality of reference beyond the heuristics that Arity/Prolog32 applies automatically as terms are stored in the database.

Note, however, that partitions grow and shrink dynamically within a given database, and each database is still subject to the 2 GB size limit.

A database can have up to 256 partitions, numbered from 0 to 255. Arity/Prolog32 reserves partitions 226 through 255 for internal use and for possible future expansion. Therefore user programs can use partitions 0 through 225.

Each Prolog thread has a current partition associated with it. When PrologInitThread is first called, the current partition is set to 0. When a database is restored, the current partition for all active Prolog threads is set to 0.

All new keys, hash tables, B-trees, and orphans are created in the calling thread's current partition. All data stored under an existing key, hash table, or B-tree, or recorded after an existing orphan (using, for example, record_after/3), is stored in the same partition as the existing key, hash table, B-tree, or orphan.

Two predicates are provided to manage partitions.

dbPartition(-/+OldPartition, +NewPartition)

The current partition is queried and changed with the predicate dbPartition/2. This is a get/set style of predicate. Note that dbPartition/2 is deterministic: it does not reset to OldPartition upon backtracking (and thus behaves differently from code_world/2 and data_world/2).
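
As a sketch, a new key can be created in a chosen partition by switching partitions around the call that creates it. The predicates user_partition/1 and record_in_partition/4 are hypothetical, and recordz/3 is assumed to create the key when it does not yet exist.

% Partitions 0 through 225 are available to user programs.
user_partition(P) :-
    integer(P),
    P >= 0,
    P =< 225.

record_in_partition(Partition, Key, Term, Ref) :-
    user_partition(Partition),
    dbPartition(Old, Partition),  % get the current partition, set the new one
    recordz(Key, Term, Ref),
    dbPartition(_, Old).          % restore explicitly; backtracking will not

The chosen partition matters only when Key is new, because data stored under an existing key goes to that key's partition. Since dbPartition/2 is not undone on backtracking, a failure between the two dbPartition/2 calls would leave Partition as the current partition.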

refPartition(+Ref, -Partition)

The predicate refPartition/2 is used to query which partition a database ref is stored in.
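
For example, assuming that the third argument of recorded/3 is the database ref of the matching term, the partition holding a term stored under a key might be found as follows. The name key_partition/2 is hypothetical.

key_partition(Key, Partition) :-
    recorded(Key, _, Ref),        % obtain a ref for a term stored under Key
    refPartition(Ref, Partition).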

Arity/Prolog32 database page and term sizing

dbsMaxSize(-Max)

Returns the maximum size of a term that can be stored on a single page in the database. This is not the maximum size of a term that can be stored because terms may span multiple pages, if necessary. dbsMaxSize/1 is useful for the creation of advanced structures such as your own B-tree-like index pages which should, in general, fit on only one page.

dbsFits(+N)

Succeeds if N is less than or equal to the value returned by dbsMaxSize/1.

Therefore, to check whether a term fits on a single page in the database, call:

    ..., dbsLen(Term, Size), dbsFits(Size), ...
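
Building on this check, the following sketch stores a term only when it fits on one page, and reports how much room a term would leave on a page. The names record_on_one_page/3 and page_headroom/2 are hypothetical. The sketch assumes that recordz/3 is used for storage and that dbsLen/2 and dbsMaxSize/1 report sizes in the same units.

% Store Term under Key only if it will not span pages; fail otherwise.
record_on_one_page(Key, Term, Ref) :-
    dbsLen(Term, Size),
    dbsFits(Size),
    recordz(Key, Term, Ref).

% Free is the space left on a page after storing Term (negative if it does not fit).
page_headroom(Term, Free) :-
    dbsLen(Term, Size),
    dbsMaxSize(Max),
    Free is Max - Size.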