Quick Way to Join Across Two Databases
A recent hack / sketchy 1st product draft required datasets that live in two different database to be joined together to do some analysis on. Initially thoughts of denormalization or key/value lookup crossed peoples minds but wanting a quick win to try and get the idea validated as quickly as possible, I kept on digging.
As always stackoverflow provided the start of the journey of discovery with something called dblink.
dblink to the rescue!
dblink is a hidden gem of a extension that made doing a join across two databases (weather there local or remote) a total breeze. Firstly the extension needs to be installed on the database with the following command.
1
|
|
Once added to your database using it requires three pieces of information
1 2 3 |
|
Connection String
I personally use this format of string as it allows easy access to both local and remote DBs.
postgres://PG_USER:PG_PASSWORD@HOST:PORT/DATABSE_NAME'
eg
postgres://pguser:abcd@127.0.0.1:5432/test_db'
Query String
This is query you wish to run on the remote machine. It can be any valid SQL.
1
|
|
Casting the results
When the results from the query are returned to local machine postgres doesn’t know the column types of the received result set and thus needs to be told about them
1
|
|
Full call example
So putting those three things together the final call would be
1 2 3 |
|
Allowing you to manipulate this results just as it they were from a local database table.