About

The memory module contains an ASCII compression algorithm created by Becky Pippen.
This project is the database version of the Intercom. If you are looking for the Intercom version of these scripts, please try that link.

Changelog

30 January 2012

As a result of discussions with Benjamin Bowenford, I have fixed the last script and updated the set-up procedures.

28 August 2011

Changed GET/PUT to FETCH and REPLACE to be in theme with the database.
Fixed data replication.
Changed URL replication scheme.
Forked the ADDRESS REPLICATOR into two parts; it seems that the compression algorithm takes a long time and makes the primitive unresponsive.
Lots of fixes and tests done on the code.

27 August 2011

Updated the database processor because of raptor jumps which messed up replication entirely.

27 August 2011

Added scripts.

26 August 2011

Started article.

News

We managed to cross-link the distributed primitive database between the OSGrid and Linden Lab's Second Life by using an altered version of the scripts. The Intercom script was adapted as well, so that it is possible to cross-chat between the OSGrid and Second Life. It seems that this technique works not only cross-region but also cross-grid.

Motivation

This series of scripts will help you set up a database, based on a simple syntax, that you can maintain within Second Life without using any external databases. The advantage is that it cuts the cost of the external databases that would otherwise be needed to keep parameters which may be reset by a region restart.

Distributed Systems and Databases

Since we cannot achieve true persistent data storage in SL, we cannot rely on a script to hold the same parameters after a sim restart or when the primitive containing it is de-rezzed. By broadcasting and replicating data between primitives placed in different regions grid-wide, we reduce the likelihood that all the primitives containing the replicated data go down at the same time due to a region restart.

Each primitive containing the data represents a node in a fully connected network in which every node is connected to every other node. This ensures that a distributed system based on the DPD has no single point of failure: it is unlikely that all the primitives placed in different regions will lose data simultaneously. Whenever a node, represented by a primitive in the network, goes down, it loses its data. However, every DPD node temporarily commits a few URLs to invisible text storage so that, when it resets, it kickstarts by reading those URLs and attempts to rejoin the network.

To be effective, the network needs a minimum of 2 (two) primitives placed in different regions. This ensures that there is at least a two-way fallback replication between the two nodes. The danger that the stored data will be lost then decreases as the number of DPD primitives on the grid grows.
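As a rough illustration of why additional nodes help, the sketch below computes the chance that every node is down at the same moment. It rests on an assumption the article does not make: that each region fails independently with the same probability. The function name and the 10% figure are hypothetical and chosen only for the example.

```python
def chance_all_nodes_down(p_single: float, nodes: int) -> float:
    """Probability that every node is down at once.

    ASSUMPTION: each node fails independently with probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if nodes < 1:
        raise ValueError("at least one node is required")
    return p_single ** nodes


# Hypothetical figure: each region is down 10% of the time.
for n in (1, 2, 3, 5):
    print(n, "node(s):", chance_all_nodes_down(0.1, n))
```

Rolling restarts are not independent events, so real figures will be less favourable than this model suggests; the point is only that each extra node in a different region lowers the risk.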

Concerning grid-wide rolling restarts, there is a considerable delay before the restart wave traverses the entire grid. By having one primitive ahead of the restart wave and one primitive behind it, we can ensure that the data is grabbed and replicated to at least one node before the wave hits the regions containing the other nodes. Given some persistent storage to kickstart a restarted node, the data replicates back to that node after the restart wave has decoupled it from the DPD network.

One could maintain a database within SL this way. When the data to be distributed is updated, it has to be replicated to all nodes. Replication runs on a time interval, which can be slow or fast, up to a limit of once per second. When data is pulled from the database, a single reply is sufficient, so a read does not generate the all-to-all node traffic that a data change generates. The data can also be pulled off the URLs directly without using the DPD script; for example, by an external program accessing one URL of the DPD network.
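The once-per-second ceiling on replication can be pictured as a simple throttle. The following is a minimal Python sketch, not the LSL used by the DPD scripts; the class name, the `min_interval` parameter and the explicit `now` argument are all assumptions made for illustration.

```python
class ReplicationThrottle:
    """Allows a replication pass at most once per min_interval seconds."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval  # hypothetical setting, in seconds
        self._last_run = None             # time of the last allowed pass

    def ready(self, now: float) -> bool:
        """Return True (and record the time) if a pass may run at `now`."""
        if self._last_run is None or now - self._last_run >= self.min_interval:
            self._last_run = now
            return True
        return False


throttle = ReplicationThrottle(min_interval=1.0)
for t in (0.0, 0.4, 1.0, 1.5, 2.1):
    print(t, "replicate" if throttle.ready(t) else "skip")
```

A larger `min_interval` would correspond to the slow replication mentioned above; the timestamps passed in are plain numbers so the behaviour can be checked without waiting.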

Network Design

The script requests a URL whenever it is reset or the region changes. It then listens for messages on that URL. When another DPD node connects to its URL, the script starts replicating its own URLs to the newly connected node and adds the newly connected node's URL to its own pool.

More precisely formulated:

Suppose there are N primitives, named for example's sake PRIMITIVE_1, PRIMITIVE_2 to PRIMITIVE_N, in N different regions, all following the same algorithm. If some PRIMITIVE_X goes down because of a restart, then once it comes back up it is sufficient to add its new URL, URL_X, to ANY other primitive in the chain. After a while, PRIMITIVE_X will have obtained the full list of URLs of all the other primitives, and its own URL_X will have been replicated to all the other primitives in the chain.

For example, given two primitives:

Suppose that a primitive PRIMITIVE_1 containing this script has a URL of the form http://URL_1.
Suppose that another primitive PRIMITIVE_2 in a different region has a URL of the form http://URL_2.
When you add http://URL_2 to PRIMITIVE_1, the URL http://URL_2 will register with PRIMITIVE_1.
After http://URL_2 is registered with PRIMITIVE_1, PRIMITIVE_1 will start sending its list of URLs to PRIMITIVE_2. As a result, PRIMITIVE_2 will also obtain the URL of PRIMITIVE_1.
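The URL exchange described above can be modelled in a few lines. This is a simplified Python simulation under stated assumptions, not the DATABASE ADDRESS REPLICATOR script: the `Node` class, `add_url`, `replicate_addresses` and `converge` are hypothetical names, and HTTP delivery is replaced by direct lookups in a dictionary.

```python
class Node:
    """A DPD node reduced to its own URL and its pool of known peer URLs."""

    def __init__(self, name, url):
        self.name = name
        self.url = url
        self.pool = set()


def add_url(node, url):
    """Register a peer URL by hand, in the spirit of the [ Add URL ] menu entry."""
    if url != node.url:
        node.pool.add(url)


def replicate_addresses(nodes_by_url):
    """One round: every node sends its own URL and its pool to each known peer.

    Returns True if any pool changed during the round.
    """
    changed = False
    for node in list(nodes_by_url.values()):
        advertised = node.pool | {node.url}
        for peer_url in list(node.pool):
            peer = nodes_by_url.get(peer_url)
            if peer is None:
                continue  # peer is down or the URL is stale
            new_urls = advertised - peer.pool - {peer.url}
            if new_urls:
                peer.pool |= new_urls
                changed = True
    return changed


def converge(nodes_by_url, max_rounds=10):
    """Repeat rounds until nothing changes; returns the number of changing rounds."""
    rounds = 0
    while rounds < max_rounds and replicate_addresses(nodes_by_url):
        rounds += 1
    return rounds


p1 = Node("PRIMITIVE_1", "http://URL_1")
p2 = Node("PRIMITIVE_2", "http://URL_2")
network = {p1.url: p1, p2.url: p2}

add_url(p1, "http://URL_2")   # the only manual step
converge(network)
print(sorted(p2.pool))        # PRIMITIVE_2 has learned PRIMITIVE_1's URL
```

The same functions work for more than two nodes: adding one known URL to any node is enough for the pools to fill in over a few rounds, which is the behaviour the N-primitive description relies on.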

Every database primitive (DPD node) maintains two maps over three lists: one from the key list to the value list and one from the key list to the timestamp list. Every key in the key list maps to exactly one value in the value list and exactly one timestamp in the timestamp list. Whenever a key is added to the key list, its value is inserted into the value list and a timestamp is placed in the timestamp list. When DPD nodes communicate, the key-value mapping is updated based on the corresponding timestamps: for each key, the value with the most recent timestamp replaces the older one.
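A minimal Python sketch of the three parallel lists and the timestamp-based update follows. It is an illustration under assumptions, not the DATABASE PROCESSOR script: the class and method names are hypothetical, timestamps are plain integers, and keeping the existing value when two timestamps are equal is an arbitrary choice made here.

```python
class NodeStore:
    """Key, value and timestamp lists kept in step, index by index."""

    def __init__(self):
        self.keys = []
        self.values = []
        self.stamps = []

    def replace(self, key, value, stamp):
        """Insert a key, or overwrite it only if `stamp` is more recent."""
        if key in self.keys:
            i = self.keys.index(key)
            if stamp > self.stamps[i]:  # ASSUMPTION: ties keep the old value
                self.values[i] = value
                self.stamps[i] = stamp
        else:
            self.keys.append(key)
            self.values.append(value)
            self.stamps.append(stamp)

    def fetch(self, key):
        """Return the stored value, or "NA" when the key is unknown."""
        if key in self.keys:
            return self.values[self.keys.index(key)]
        return "NA"

    def merge_from(self, other):
        """Take every entry from another node that is newer than ours."""
        for key, value, stamp in zip(other.keys, other.values, other.stamps):
            self.replace(key, value, stamp)


db1, db2 = NodeStore(), NodeStore()
db1.replace("coffee", "old answer", stamp=10)
db2.replace("coffee", "sorry, i don't drink coffee", stamp=20)
db1.merge_from(db2)
print(db1.fetch("coffee"))
```

Because the newer timestamp always wins, two nodes that merge from each other in either order end up holding the same value for each key.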

The database model follows a decentralized peer-to-peer mode of operation in which clients and servers are interchangeable and both contribute to the network. However, for simplicity, the clients only expand the network, whereas the servers both extend the network and propagate the data. Synchronization and data precedence are attained by exchanging timestamps, since a recorded timestamp does not vary and time never decreases.

Quick Set-Up

For a demonstration, use the following steps to set up a two-node DPD network with one client:

1. Create two primitives and change their names to DB1 and DB2 respectively.
2. Go to the "Server Scripts" section of this article and copy the scripts DATABASE ADDRESS REPLICATOR, DATABASE DATA REPLICATOR, DATABASE KICKSTART MODULE and DATABASE PROCESSOR.
3. Drop these four scripts in both primitives DB1 and DB2.
4. Click either primitive DB1 or DB2, select [ My URL ] from the dialog menu and copy the address it tells you on the main chat.
5. Click the other primitive, select [ Add URL ] and follow the instructions on the main chat to add the URL you got at step 4.
6. Create a third primitive and name it CLIENT.
7. Go to the "Client Scripts" section of this article and copy the scripts DATABASE CLIENT and DATABASE TEST MODULE, as well as the DATABASE PROCESSOR and the DATABASE KICKSTART MODULE from the "Server Scripts" section.
8. Drop the scripts from step 7 into the third primitive named CLIENT.
9. Touch the CLIENT primitive, select [ Add URL ] and follow the instructions on the main chat to add the URL you got at step 4.
10. Wait for the two databases DB1 and DB2 and the client to hook up. You can watch this happening by clicking either the DB1 primitive or the DB2 primitive and selecting [ List URLs ]. After a while, both DB1 and DB2 should return the same list of URLs: one URL for DB1, one URL for DB2 and one URL for the client you attached in step 9.

If you have got this far, you can now type some database commands directly on the main chat. For example, here is a transcript of me trying to get the value for the key coffee:

Flax [Morgan LeFay]: @db_FETCH=coffee
Client: DB answer: NA

which indicates that there is no such key (NA stands here for not available) called coffee in the database. In that case, we add a new key called coffee to the database; here is the transcript:

Flax [Morgan LeFay]: @db_REPLACE=coffee:sorry, i don't drink coffee

and now, we wait a little and query the database again to get the value of the key coffee; here is the transcript:

Flax [Morgan LeFay]: @db_FETCH=coffee
Client: DB answer: NA

that does not sound good, what happened here? When we tried to retrieve the value for the key coffee, the database client asked a DPD node to which the new data had not yet been replicated. We give it some more time and then ask again:

Flax [Morgan LeFay]: @db_FETCH=coffee
Client: DB answer: sorry, i don't drink coffee

there we go.
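The two chat commands used in the transcripts share a simple shape: the prefix @db_, a command name, an equals sign, and then either a key (FETCH) or a key and a value separated by a colon (REPLACE). The Python sketch below splits such a line into its parts. It is inferred from the transcripts only; the function name is hypothetical, the actual parsing in the DATABASE CLIENT script may differ, and it assumes that a key never contains a colon.

```python
def parse_command(line):
    """Split a chat line into (command, key, value).

    Returns None when the line is not a recognised database command.
    ASSUMPTION: keys contain no ':'; everything after the first ':' is the value.
    """
    prefix = "@db_"
    if not line.startswith(prefix) or "=" not in line:
        return None
    command, argument = line[len(prefix):].split("=", 1)
    if command == "FETCH":
        return ("FETCH", argument, None)
    if command == "REPLACE" and ":" in argument:
        key, value = argument.split(":", 1)
        return ("REPLACE", key, value)
    return None


print(parse_command("@db_FETCH=coffee"))
print(parse_command("@db_REPLACE=coffee:sorry, i don't drink coffee"))
```

Splitting on the first colon only means the value itself may contain colons or commas, as the coffee example does.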

Server Configuration

In a typical setup, a DPD server will contain the following scripts:

database address replicator
database data replicator
database kickstart module
database processor

Client Configuration

A typical client consists of a primitive containing the following scripts:

database client
database kickstart module
database test module

The database client script is the script that will listen for link messages and relay them to the DPD network. The database test module script is optional and used here just to explain how developers could couple their own scripts to the database client.

Table of Contents