My Software Notepad

Friday, February 22, 2019

Creating A Secure Schema in PostgreSQL

Your SUPERUSER account may be deployed to the database as part of provisioning your instance. This is the case when you use AWS RDS. If so you manage admin rights using a different tooling (Puppet, Terraform, Ansible).

A basic security model in an idempotent parameterized script may look like this

-- ROLES WE USE FOR SCHEMA ACCESS GRANTS

--Role

DO $$

BEGIN

CREATE ROLE sisense_or_somthing_read_role;

EXCEPTION

WHEN duplicate_object THEN

RAISE NOTICE 'Role already exists. Ignoring...';

END$$;

--Role

DO $$

BEGIN

CREATE ROLE sisense_or_somthing_readwrite_role IN ROLE sisense_or_somthing_read_role;

EXCEPTION

WHEN duplicate_object THEN

RAISE NOTICE 'Role already exists. Ignoring...';

END$$;

--Role

DO $$

BEGIN

CREATE ROLE sisense_or_somthing_exececutor IN ROLE sisense_or_somthing_readwrite_role;

EXCEPTION

WHEN duplicate_object THEN

RAISE NOTICE 'Role already exists. Ignoring...';

END$$;

--Role

DO $$

BEGIN

CREATE USER sisense_or_somthing_reader WITH PASSWORD :var_pwd1 IN ROLE sisense_or_somthing_read_role;

EXCEPTION

WHEN duplicate_object THEN

RAISE NOTICE 'Role already exists. Ignoring...';

END$$;

--Role

DO $$

BEGIN

CREATE USER sisense_or_somthing_readerwriter WITH PASSWORD :var_pwd2 IN ROLE sisense_or_somthing_readwrite_role;

EXCEPTION

WHEN duplicate_object THEN

RAISE NOTICE 'Role already exists. Ignoring...';

END$$;

--Role

DO $$

BEGIN

CREATE USER sisense_or_somthing_maintenance WITH PASSWORD :var_pwd3 IN ROLE sisense_or_somthing_exececutor;

EXCEPTION

WHEN duplicate_object THEN

RAISE NOTICE 'Role already exists. Ignoring...';

END$$;

CREATE SCHEMA IF NOT EXISTS sisense_or_somthing;

GRANT SELECT ON ALL TABLES IN SCHEMA public TO sisense_or_somthing_read_role;

GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO sisense_or_somthing_exececutor;

GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA sisense_or_somthing TO sisense_or_somthing_exececutor;

GRANT SELECT ON ALL TABLES IN SCHEMA sisense_or_somthing TO sisense_or_somthing_read_role;

GRANT INSERT ON ALL TABLES IN SCHEMA sisense_or_somthing TO sisense_or_somthing_readwrite_role;

GRANT UPDATE ON ALL TABLES IN SCHEMA sisense_or_somthing TO sisense_or_somthing_readwrite_role;

GRANT DELETE ON ALL TABLES IN SCHEMA sisense_or_somthing TO sisense_or_somthing_readwrite_role;

GRANT TRUNCATE ON ALL TABLES IN SCHEMA sisense_or_somthing TO sisense_or_somthing_readwrite_role;

A validation query or two can help

SELECT

pr.oid role_oid, pr.rolname role_name, pr.rolcanlogin, pr.rolinherit, pr_p.rolname parent_role,

m.roleid role_member_oid, m.grantor, m.admin_option,

pr_g.rolname grantor_name, pr_g.rolcanlogin, pr_g.rolinherit

FROM

pg_roles pr

left join pg_auth_members m ON m.member = pr.oid

left join pg_roles pr_g on pr_g.oid = m.grantor

left join pg_roles pr_p on pr_p.oid = m.roleid

WHERE pr.rolname in (

'kuras_or_something_reader', 'kuras_or_something_readerwriter',

'kuras_or_something_writer', 'kuras_or_something_wrtie_role', 'kuras_or_something_exececutor',

'kuras_or_something_maintenance')

SELECT grantee, string_agg(privilege_type, ', ') AS privileges

FROM information_schema.role_table_grants

WHERE table_name=‘model_results_4’

GROUP BY grantee;

And the revert migration script would be something like this.

DROP SCHEMA IF EXISTS sisense_or_somthing;

DROP USER IF EXISTS sisense_or_somthing_admin;

DROP USER IF EXISTS sisense_or_somthing_reader;

DROP USER IF EXISTS sisense_or_somthing_readerwriter;

DROP USER IF EXISTS sisense_or_somthing_maintenance;

DROP ROLE IF EXISTS sisense_or_somthing_read_role;

DROP ROLE IF EXISTS sisense_or_somthing_readwrite_role;

DROP ROLE IF EXISTS sisense_or_somthing_exececutor;

Wednesday, June 6, 2018

Data Architecture - Document Model Design

Relational modeling has drawbacks to design implementation and to access speed. One differentiator is our common language around column based design patterns. Codd's normal forms still provide some guidance for us when it comes to document model design principals. A fundamental design guideline is ease of access is sometimes aided by denormalization, but more often aided by atomicity.

Below assume a document attribute is the same as a column and that a row is synonymous with a document.

NOTE: This is consistent across most engines except PostgreSQL, which has some designs that externalize highly read JSONB document elements into individual columns in the same row as the document for old school indexing purposes pre-GIN indexing.

1NF

Each column (or attribute) has no more than one value, each row (or document) is unique.

Can identify a unique candidate key.

2NF

Each column (or attribute) has no more than one value, each row (or document) is unique.

Can identify a unique candidate key.

All attributes or columns are dependent on the candidate key.

3NF

Each column (or attribute) has no more than one value, each row (or document) is unique.

Can identify a unique candidate key.

All attributes or columns are dependent on the candidate key.

No document or row creates a transitive association

A -> B -> C where A !-> C
Driver -> Car -> Wheel Count where Driver has no Wheels.

In columns this is easy. If one property should have its own unique id in the table, it moves to a new table. This is much more complicated in a document model. Below are different ways to model this.The first three create transitive associations. I can select driver and wheels where id = 1 and it will return the transitive value “Bob has four wheels.” The fourth creates a child document, but embeds it.

{id: “1”, [“Bob”, “car”, 4]}
{id: “1”, “driver”:”Bob”, “vehicle”: “car”, “wheels”: “4”}
{id: “1”, [“driver”:”Bob”, “vehicle”: “car”, “wheels”: “4”]}
{id: “1”, “driver”:”Bob”, “car”:{id: “1”, "type“: "wheels”, "count": 4}}
{id: “1”, “driver”:”Bob”, “car”:[{id: “1”, "type“: "wheels”, "count": 4}, {id: “2”, "type“: "hubs”, "count": 4},{id: “3”, "type“: "lug nuts”, "count": 16}]}

Examples four and five are different forms of the same child document embedding. This form of normalization is faster to access and has less data access code to write. Well, it is faster as long as the sub-document relationship is bounded in its many-ness (somewhere between 1:1 and 1:5).

Areas of Concern in Embedding

When the set of possible children can balloon out the size of the document beyond reasonable access speed, it no longer represents a well defined single entity. It defines an entity relationship, one that requires multiple entities to allow for atomicity. Large heterogeneous documents are really poorly designed entity relationships hiding in an elephant costume.

Many to many relationships also require externalization of child documents.

Data with different volatility requirements (write, update, delete operations) should be considered for externalized sub documnet storage.

Sunday, November 12, 2017

NodeJs Bash Build Script

#!/bin/bash
# Each lambda is technically its own project with its own dependencies
function npmTask() {
startingDir=$(pwd)
npmCommand=$1
for nodeDir in $startingDir/lambdas/*/;
do
echo "start node dir: $nodeDir from $startingDir"
if [ -d $nodeDir ] && [ $nodeDir != "$startingDir/lambdas/spec/" ];
then
echo "start node dir: $nodeDir from $startingDir"
cd $nodeDir
npm $npmCommand
cd $startingDir
else
echo "no node modules found"
fi
done
}
npmTask "install"
npmTask "build"
npmTask "test"

Sunday, October 1, 2017

Python Map Reduce on Tuple List

I was recently asked to read some pseudocode for calculating a class average of weighted scores. I wanted to work this out in Python to demonstrate how the design decisions around data structures impact implementation. Two independent lists assume order is maintained properly in two places. However it makes the code easier to explain.

flatScores = [75,95,85,65]
weights = [0.2,0.35,0.15,0.3]

def mult(x,y): return x * y
# map(mult,flatScores,weights)
# [15.0, 33.25, 12.75, 19.5]
reduce((lambda sum,score: (sum + score) ),list(map(mult,flatScores,weights)))/len(flatScores)

#20.125

A single list of tuples allows members to include all relevant data in one place.

scores = [[75,0.2],[95,0.35],[85,0.15],[65,0.3]]

def mapper(scoreWeights): return map( lambda scoreWeight: scoreWeight[0] * scoreWeight[1] , scoreWeights )
# mapper(scores)
# [15.0, 33.25, 12.75, 19.5]
reduce(lambda sum, cur: sum + cur, mapper(scores))/len(scores)
#20.125

Thursday, May 4, 2017

Bash Function To Create Desktop Shortcuts

Shortcuts are located in three places
/usr/share/applications/intellij.desktop for all users
a folder in the home dir for a given user
~/Desktop to see the shortcut on the desktop

Format is
[Desktop Entry]
Version=13.0
Type=Application
Terminal=false
Icon[en_US]=/home/rob/.intellij-13/bin/idea.png
Name[en_US]=IntelliJ
Exec=/home/rob/.intellij-13/bin/idea.sh
Name=IntelliJ
Icon=/home/rob/.intellij-13/bin/idea.png

The function looks like this:

function add_a_shortcut() {
APP=$1
ICON=$2
EXEFILE=$3
EXEC="bash -ic \"$EXEFILE &\""
SHORTCUT="/usr/share/applications/$APP.desktop"
USERSHORTCUT="/home/vagrant/Desktop/$APP.desktop"
if [ -e "$EXEFILE" ]; then
sudo touch $SHORTCUT $USERSHORTCUT
sudo chmod 777 $SHORTCUT $USERSHORTCUT
echo "[Desktop Entry]" | tee $SHORTCUT $USERSHORTCUT
echo "Encoding=UTF-8" | tee -a $SHORTCUT $USERSHORTCUT
echo "Comment=Launch $1" | tee -a $SHORTCUT $USERSHORTCUT
echo "Type=Application" | tee -a $SHORTCUT $USERSHORTCUT
echo "Terminal=false" | tee -a $SHORTCUT $USERSHORTCUT
echo "Exec=$EXEC" | tee -a $SHORTCUT $USERSHORTCUT
echo "Name=$APP" | tee -a $SHORTCUT $USERSHORTCUT
echo "Icon=$ICON" | tee -a $SHORTCUT $USERSHORTCUT

sudo chmod 644 $SHORTCUT
sudo chown root:root $SHORTCUT
sudo chown vagrant:vagrant $USERSHORTCUT
echo "INFO: Created $SHORTCUT $USERSHORTCUT"
else
echo "ERROR: Failed to create $SHORTCUT $USERSHORTCUT"
fi
}

EXAMPLE USAGE
add_a_shortcut aggregation-designer /opt/pentaho/design-tools/aggregation-designer/aggregation-designer.app/Contents/Resources/pad.icns /opt/pentaho/design-tools/aggregation-designer/startaggregationdesigner.sh

Thursday, March 16, 2017

Why is it so hard to install VirtualBox Guest Additions on Centos

 yum update
 yum install gcc make kernel-devel bzip2
 mkdir -p /media/cdrom
 mount /dev/sr0 /media/cdrom
 sh /media/cdrom/VBoxLinuxAdditions.run

Wednesday, December 14, 2016

Notes on Slick 3

Most examples are using await and that is blocking. Why use Slick if you are going to block? Look at onComplete, onSuccess, onFailure, and promise.

** Chaining transactions looks like this in one case

db.run{ (for { customer <- customerTable.save(); address <- addressTable.save() } yield ()).transactionally}

** Returning a list of ids from a select looks like this

 (but you can also use returning keyword)

db = Database.forConfig("myactors.database")

session = db createSession;session.conn.setCatalog("mydb")
session.conn.setSchema("myschema")

val query = Tables.rawStringTest.to[List].map(_.id)
val resultSet = db.run(query.result)

**this prints the list

resultSet.onComplete(res =>   println(s"string returns $res") )

**this waits for the future to return and uses the success/failure traits

resultSet.onComplete{
  case Success(res) => res.foreach(x => db.run(Tables.rawStringTest.filter(_.id === x).delete))
  case Failure(ex) =>  println(s"error $ex")
}

**Testing is a challenge for aysync operations. Using an eventually retry block or a continuation both work.  This is with eventually.

it("ids are generated when None is passed in table calls"){
  val pk = None
  val validation = Tables.rawStringMessages.map(_.message_data)

  sut.insertStringMsg(None, "this is a fancy message", mockSuccessFunction(waiter), mockFailureFunction(waiter))
  sut.insertStringMsg(None, "this is another fancy message", mockSuccessFunction(waiter), mockFailureFunction(waiter))

  eventually(timeout(duration)) {
  val resultSet = Await.result(db.run(validation.result), duration)

resultSet.onComplete{

 case Success(res) => resultSet.head.equals(99) shouldBe true

 case Failure(ex) => fail(s"this failed $ex)

**Some ways to insert

table returning table forceInsert JsonMessage(receiptId, messageData, None)

table returning table insertOrUpdate JsonMessage(receiptId, messageData, None)
table returning table += JsonMessageInsert(receiptId, messageData)

**And any of these can return nothing

table += JsonMessageInsert(receiptId, messageData)

table insertOrUpdate JsonMessage(receiptId, messageData, None)

/** * `Await` is what is used to ensure proper handling of blocking for `Awaitable` instances. * * While occasionally useful, e.g. for testing, it is recommended that you avoid Await * when possible in favor of callbacks and combinators like onComplete and use in * for comprehensions. Await will block the thread on which it runs, and could cause * performance and deadlock issues. */

Clean up is a chore as well for integration tests. In this case I did clean up in the AfterAll block  to avoid cleaning data during concurrent test runs.

override def afterAll() {
  db = Database.forConfig("rtisystem.database")

  session = db createSession;  session.conn.setCatalog("rti")
  session.conn.setSchema("rti")

  val query = Tables.rawStringMessages.to[List].map(_.id)
  val resultSet = db.run(query.result)

  resultSet.onComplete{
    case Success(res) => res.foreach(x => db.run(Tables.rawStringMessages.filter(_.id === x).delete))
    case Failure(ex) =>  println(s"error $ex")
  }

  val jquery = Tables.jsonMessages.to[List].map(_.id)
  val jresultSet = db.run(jquery.result)

  jresultSet.onComplete{
    case Success(res) => res.foreach(x => db.run(Tables.jsonMessages.filter(_.id === x).delete))
    case Failure(ex) =>  println(s"error $ex")
  }

This is an test example where I insert two records through a DAO extension method and compare them in a select query. 

it("generates ids when None is passed instead of a long in the table call"){
  val pk = None
  val msgList: List[String] = List("this is a fancy message", "this is a fancy message also")
  val read = Tables.rawStringMessages.filter(row => row.message_data inSet msgList).map(_.id)

  sut.insertStringMsg(None, msgList.head,      mockSuccessFunction(waiter), mockFailureFunction(waiter))
  sut.insertStringMsg(None, msgList.tail.head, mockSuccessFunction(waiter), mockFailureFunction(waiter))

  eventually(timeout(duration)) {
    val resultSet = db.run(read.result)
    resultSet.onComplete {
      case Success(res) => assert(res.head != res.tail.head && res.head > 100000 && res.tail.head > 100000)
      case Failure(ex) => println(s"error $ex")
    }
  }
}

val visitorId: Int = // whatever
val locationCodes = List("loc1","loc2","loc3"...)
// your query, with bind params.
val q = for {
    v <- Visits 
    if v.visitor is visitorId.bind
    if v.location_code inSetBind locationCodes
  } yield v
// have a look at the generated query.
println(q.selectStatement)
// run the query
q.list

*** To use a db default ***

O.AutoInc 
column_name.?

Column Option[Timestamp]

table += Class(id, data, Some(timestamp))

Or a detailed version

import slick.driver.PostgresDriver.api._
import slick.lifted._
import java.sql.{Date, Timestamp}

/** A representation of the message decorated for Slick persistence
  * created_date should always be null on insert operations.
  * It is set at the database level to ensure time syncronicity
  * Id is the Twitter snowflake id. All columns NotNull unless declared as Option
  * */
class RawMessages(tag: Tag) extends Table[(String, Option[String], Timestamp)](tag, Some("rti"), "RawMessages") {
  def id = column[String]("id", O.PrimaryKey)
  def MessageString = column[Option[String]]("MessageString")
  def CreatedDate = column[Timestamp]("CreatedDate", O.SqlType("timestamp default now()"))
  def * = (id, MessageString, CreatedDate)
}

My Software Notepad

Friday, February 22, 2019

Creating A Secure Schema in PostgreSQL

Wednesday, June 6, 2018

Data Architecture - Document Model Design

1NF

2NF

3NF

Areas of Concern in Embedding

Sunday, November 12, 2017

NodeJs Bash Build Script

Sunday, October 1, 2017

Python Map Reduce on Tuple List

Thursday, May 4, 2017

Bash Function To Create Desktop Shortcuts

Thursday, March 16, 2017

Why is it so hard to install VirtualBox Guest Additions on Centos

Wednesday, December 14, 2016

Notes on Slick 3

Followers

About Me