Example query

SELECT c.tstamp [target list]
 FROM commits c, actions a [range table]
WHERE a.file IN [qualifier]
 (SELECT id FROM files [IN-clause subquery]
 WHERE path = ’...’)
AND a.commit_id = c.id [join predicate]
ORDER BY c.tstamp DESC [sort order]
LIMIT 1; [limit expression]

Advertisements
Posted in Uncategorized | Leave a comment

PostgreSQL Planner

Query Plan:

a tree of plan nodes; each node is a single operation (join, disk scan, sort, etc.)

Planner: generate path, then generate Query Plans.

  • Planning outer joins
  • Planning subqueries
  • Planning expression subqueries
  • Planning functions
  • Planning set operations

Plan node:

  • What does the plan node look like?

Path

Cost

Join Method:

  • Nested Loop
  • Hash Join
  • Merge Join

System R algorithm

Posted in Uncategorized | Leave a comment

PostgreSQL references

http://www.postgresql.org/docs/8.4/static/biblio.html

Query Planner

http://www.postgresql.org/docs/9.1/static/geqo-pg-intro.html

Bruce MomJian

http://momjian.us/main/presentations/internals.html

Postgres-XL

Posted in Uncategorized | Leave a comment

Postgres-XL Installation Example on Linux

1. Overview

In this example, we use pgxc_ctl to configure a Postgres-XL cluster for test only.
The cluster includes 4 nodes, 1 GTM, 1 Coordinator and 2 Datanodes.
Each node is deployed on a separate host so there are totally 4 hosts used.

It’s also possible to deploy all the nodes on one host.
To do that, you just need to change all the IPs to the same one.

GTM:
hostname=host1
nodename=gtm
IP=192.168.187.130
port=6666

Coordinator:
hostname=host2
nodename=coord1
IP=192.168.187.131
pooler_port=6668,port=5432

Datanode1:
hostname=host3
nodename=datanode1
IP=192.168.187.132
pooler_port=6669, port=15432

Datanode2:
hostname=host4
nodename=datanode2
IP=192.168.187.133
pooler_port=6670, port=15433
2. Install from source code

Use git to clone the source code, see https://sourceforge.net/projects/postgres-xl.
On every host, create an account named ‘postgres’.
Use the postgres account to download the Postgres-XL source, then make.
Use the root account to install.

2.1 Install Postgres-XL on all hosts

Download source code and enter the source code directoy, e.g. /home/postgres/postgres-xl.

Then run following,
cd /home/postgres/postgres-xl
./configure
make

Install the Postgres-XL as user root,
make install

By default, Postgres-XL is installed in directory /usr/local/pgsql/.

2.2 Install pgxc_ctl on host1 only

As postgres account, enter the pgxc_ctl source directory,e.g. /home/postgres/postgres-xl/contrib/pgxc_ctl, build ptxc_ctl.
cd /home/postgres/postgres-xl/contrib/pgxc_ctl
make

Then install pgxc_ctl using root,
make install

2.3 Add /usr/local/pgsql/bin to PATH for postgres

On every machine, edit file /home/postgres/.bashrc, add a line as following,
export PATH=/usr/local/pgsql/bin:$PATH

3 Configure the firewall
Note, if all nodes are installed on the same host, this section can be skipped.

On each host, configure the firewall to accept incomming connections for the TCP port numbers.
If the cluster is only for testing, you can just turn off the firewall.

4. Configure ssh authentication to avoid inputing password for pgxc_ctl

Use postgres to do all following.

On host1, generate the authentication key file,
ssh-keygen -t rsa (Just press ENTER for all input values)
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

On host1, upload file authorized_keys to host2, host3 and host3, as following,
scp ~/.ssh/authorized_keys postgres@192.168.187.131:~/.ssh/
scp ~/.ssh/authorized_keys postgres@192.168.187.132:~/.ssh/
scp ~/.ssh/authorized_keys postgres@192.168.187.133:~/.ssh/

On every host, run following commands,
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys

On host1, try to connect host2, host3 and host4, make sure no password is needed,
ssh postgres@192.168.187.131
ssh postgres@192.168.187.132
ssh postgres@192.168.187.133

5. Build the cluster
Use account postgres do all following.

5.1 On host1, configure pgxc_ctl.conf

By default, pgxc_ctl uses /home/postgres/pgxc_ctl as its directory.
This directory is created when you run pgxc_ctl for the first time.

Edit /home/postgres/pgxc_ctl/pgxc_ctl.conf

#user and path
pgxcOwner=postgres
pgxcUser=$pgxcOwner
pgxcInstallDir=/usr/local/pgsql

#gtm and gtmproxy
gtmMasterDir=$HOME/pgxc/nodes/gtm
gtmMasterPort=6666
gtmMasterServer=192.168.187.130
gtmSlave=n

#gtm proxy
gtmProxy=n

#coordinator
coordMasterDir=$HOME/pgxc/nodes/coord
coordNames=(coord1)
coordPorts=(5432)
poolerPorts=(6668)
coordPgHbaEntries=(192.168.187.0/24)
coordMasterServers=(192.168.187.131)
coordMasterDirs=($coordMasterDir/coord1)
coordMaxWALsernder=0
coordMaxWALSenders=($coordMaxWALsernder)
coordSlave=n
coordSpecificExtraConfig=(none none none)
coordSpecificExtraPgHba=(none none none)

#datanode
datanodeNames=(datanode1 datanode2)
datanodePorts=(15432 15433)
datanodePoolerPorts=(6669 6670)
datanodePgHbaEntries=(192.168.187.0/24)
datanodeMasterServers=(192.168.187.132 192.168.187.133)
datanodeMasterDir=$HOME/pgxc/nodes/dn_master
datanodeMasterDirs=($datanodeMasterDir/datanode1 $datanodeMasterDir/datanode2)
datanodeMaxWalSender=0
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)
datanodeSlave=n
primaryDatanode=datanode1

5.2 Create the cluster
When /home/postgres/pgxc_ctl/pgxc_ctl.conf is ready, run following command to create the cluster,
pgxc_ctl init all

This will initialize and start all nodes of the cluster.

6 Test the cluster
Use account postgres do all following.

6.1 Create the database

On host1, start pgxc_ctl,
pgxc_ctl

then input command,
Createdb test

This will create a database named ‘test’ . After the command, a database named ‘test’ should exist on both datanode1 and datanode2.

6.2 Connect to coord1, create a table, insert data and read data

On host2, connect to coord1,

/usr/local/pgsql/bin/psql test
test=# create table contact( id int, name text, phone varchar(30)) DISTRIBUTE BY REPLICATION;
test=# insert into contact values ( 1,’tom’,’1212121′);
test=# select * from contact;

6.3 Read data on datanode1

On host3, connect to datanode1 and read data,

/usr/local/pgsql/bin/psql test -p 15432
select * from contact;

6.4 Read data on datanode2

On host4, connect to datanode2 and read data,
/usr/local/pgsql/bin/psql test -p 15433
select * from contact;

7 Stop the cluster

On host1, start pgxc_ctl,
pgxc_ctl

then input following command to stop the cluster,
stop all

Posted in PostgreSQL | Leave a comment