Amazon Web Services (AWS)

Generic Topics

Multi-tenant storage options on AWS cloud for SaaS solutions

 Multi-tenancy is one of the core challenges in building & delivering SaaS solutions; different strategies are available to partition data, but complexity arises in securing that data - controlling access across tenants while ensuring data confidentiality, integrity & seamless access for SaaS consumers;

Given that SaaS, by definition, involves no consumer management of network, infrastructure, database & application [all are managed by the cloud provider], it's a challenge for any cloud provider to t-shirt-size the hardware & software components, optimize cost in the shared environment, and put security measures in place with due consideration to non-functional attributes such as performance, availability & reliability - so as to win consumers' confidence & help consumers build their business on the cloud;

Partitioning models - silo, bridge & pool: the silo model has a separate database instance for each tenant; the bridge model has a single database & a separate schema for each tenant; the pool model has a shared (database + schema), partitioning by database record / row classified with a tenant id; in the pool model, the partition key is used to segregate access to tenant data;
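The pool model's record-level partitioning can be sketched as a composite key that embeds the tenant id. This is an illustrative sketch only - the attribute names (`pk`, `sk`) and the `TENANT#` prefix convention are assumptions, not from the original text:

```python
# Pool-model key sketch: every item carries its tenant id in the
# partition key, so one shared table serves all tenants.
# Names (pk, sk, TENANT#/ORDER# prefixes) are illustrative assumptions.

def pool_item_key(tenant_id: str, entity: str, entity_id: str) -> dict:
    """Build a composite key that scopes each record to its tenant."""
    return {
        "pk": f"TENANT#{tenant_id}",    # partition key isolates tenant data
        "sk": f"{entity}#{entity_id}",  # sort key identifies the record
    }

key = pool_item_key("acme", "ORDER", "1001")
# key == {"pk": "TENANT#acme", "sk": "ORDER#1001"}
```

Because every access path starts from the tenant-scoped partition key, tenant isolation reduces to controlling which partition-key values a caller may touch.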

Silo model tradeoffs:

  • applicable when customers must comply with strict regulatory & security requirements;
  • outages are contained, since availability is managed at the tenant level;
  • cross-tenant impacts are limited, hence less exposure to security attacks;
  • priced higher compared with the other models;

Pool model tradeoffs:

  • one-stop monitoring across all tenants: managing & monitoring health checks, troubleshooting & problem resolution happen in one place; on-boarding new clients onto the shared platform is easier;
  • provides operational agility to the cloud service provider, while disruptions impact all consumers on-boarded to the platform;
  • outages are comparatively more frequent given the shared environment, and the platform is more vulnerable to attacks - hence due diligence is required to introduce:
    • monitoring mechanisms, incident management & resolution processes
    • intelligent / automated issue resolution (where applicable)
    • dynamic scaling to onboard new consumers, provisioning capacity at runtime when needed and de-provisioning when demand reduces;
  • managing multi-tenant data in a shared data model is less isolated, hence consistency & integrity are a trade-off; data size & distribution also influence the data management strategy in the pool model;
Business requirements drive the decision to adopt the silo / pool / bridge model; apps that require the silo model can be budgeted for, while identifying those apps open to adopting the shared pool model;

Multi-tenancy on DynamoDB - being a NoSQL database, DynamoDB has no notion of a database instance; instead, all tables are scoped to an account within a region - hence every table name must be unique within that account & region;

  • uses eventually consistent READs by default: a read issued shortly after a write (typically within about one second) may return stale data - sometimes called the one-second rule;
  • also supports strongly consistent READs, which return the most recently written data;
  • data is spread across different geographic locations and stored on SSD storage, hence faster;
  • also has the ability to conduct ACID compliant transactions;
  • the silo model on DynamoDB requires grouping the tables associated with a specific tenant
  • an approach is also needed to create secure, controlled views of the tables - preventing cross-account access;
  • IAM policies are used to control access to DynamoDB tables;
  • CloudWatch metrics can be captured at the table level, which simplifies aggregation of per-tenant metrics;
  • table read & write capacity (provisioned as read/write capacity units, akin to IOPS) is applied at the table level, hence distinct scaling policies can be created per table;
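The IAM-based access control mentioned above can use DynamoDB's `dynamodb:LeadingKeys` condition key to restrict a caller to items whose partition key carries its tenant id. The sketch below builds such a policy document as plain JSON; the table ARN, action list, and `TENANT#` prefix are illustrative assumptions:

```python
import json

def tenant_table_policy(table_arn: str, tenant_id: str) -> str:
    """Return an IAM policy allowing item access only where the
    partition key starts with this tenant's id (dynamodb:LeadingKeys).
    The TENANT# prefix convention is an assumed naming scheme."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
            "Resource": table_arn,
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Matches the leading partition-key value of every item
                    # touched by the request.
                    "dynamodb:LeadingKeys": [f"TENANT#{tenant_id}"]
                }
            },
        }],
    }
    return json.dumps(policy, indent=2)
```

Attaching a per-tenant policy like this to the tenant's execution role is one common way to enforce row-level isolation in the pool model without separate tables.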

With dynamo DB:

  • each table has a partition (primary) key and an optional sort key, which together act like an index;
  • secondary indexes can be created, in two flavours: local & global secondary indexes;
  • a local secondary index queries by the table's partition key, but the sort key can be different
  • a global secondary index can define its own partition & sort keys, independent of the table's
  • in DynamoDB, we can use a global secondary index and aggregate on a specific field, to get a feature similar to MapReduce views in Cloudant
  • the concept of sparse indexes: the index contains entries only for those JSON records where the indexed field exists; other database records are not indexed
  • DynamoDB is best suited to storing structured & consistent JSON documents;
  • local secondary indexes must be created when the table is created
    • global secondary indexes can be created any time after the table is created
  • advanced DynamoDB offers on-demand backup & restore, operating within the same region as the source table;
  • another advanced offering is point-in-time recovery: a continuous, incremental backup feature enabled on demand; the latest restorable point is typically within the last five minutes;
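The index rules above can be made concrete with a table specification in the shape that boto3's `create_table` expects. Everything here (table name, attribute names, index names) is a hypothetical example; note that the LSI reuses the base partition key while the GSI defines its own, and that items lacking the GSI's key attribute simply won't appear in it (the sparse-index behaviour mentioned above):

```python
# Hypothetical DynamoDB table spec (boto3 create_table shape) showing:
#  - an LSI: same partition key as the table, different sort key,
#    must be declared at creation time;
#  - a GSI: independent key, can be added after creation; items
#    without a "status" attribute are left out of it (sparse index).
table_spec = {
    "TableName": "Orders",
    "KeySchema": [
        {"AttributeName": "tenant_id", "KeyType": "HASH"},   # partition key
        {"AttributeName": "order_id", "KeyType": "RANGE"},   # sort key
    ],
    "AttributeDefinitions": [
        {"AttributeName": "tenant_id", "AttributeType": "S"},
        {"AttributeName": "order_id", "AttributeType": "S"},
        {"AttributeName": "order_date", "AttributeType": "S"},
        {"AttributeName": "status", "AttributeType": "S"},
    ],
    "LocalSecondaryIndexes": [{
        "IndexName": "ByDate",
        "KeySchema": [
            {"AttributeName": "tenant_id", "KeyType": "HASH"},   # same as table
            {"AttributeName": "order_date", "KeyType": "RANGE"}, # different sort key
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    "GlobalSecondaryIndexes": [{
        "IndexName": "ByStatus",
        "KeySchema": [
            {"AttributeName": "status", "KeyType": "HASH"},  # independent key
        ],
        "Projection": {"ProjectionType": "KEYS_ONLY"},
    }],
    "BillingMode": "PAY_PER_REQUEST",
}
```

Passing this dict to `boto3.client("dynamodb").create_table(**table_spec)` would create the table; the spec itself is just data, so the key-design decisions can be reviewed and tested without touching AWS.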

Auto-scaling with DynamoDB doesn't always work as expected: while it scales out well, scaling in can lag when the workload decreases; use on-demand capacity if the required throughput or the number of reads/writes is genuinely unknown;

for known workloads, provision DynamoDB capacity by the throughput required; the downside here is that if you don't enable auto-scaling, you run the risk of requests being throttled when the provisioned capacity is overloaded;

The bridge model & silo model operate similarly on DynamoDB; the only difference is that table-level access policies are a bit more relaxed;

With the pool model, the challenge is that data in a multi-tenant SaaS environment typically doesn't have a uniform distribution; it's very common for a few tenants to consume a large portion of the available data footprint; hence keying partitions directly on the tenant can cause partition "hot spots" - which in turn impact the cost & performance of the solution; spreading each tenant's data across multiple partition-key values, with increased capacity provisioned to offset the impact & distribute the workload, is a design approach to resolve this;
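One common way to spread a hot tenant across partitions is write sharding: append a deterministic shard suffix to the partition key. A minimal sketch, assuming 8 shards per tenant and the same `TENANT#` prefix convention used for illustration above:

```python
import hashlib

N_SHARDS = 8  # assumption: spread each tenant's items across 8 key values

def sharded_pk(tenant_id: str, record_id: str) -> str:
    """Append a deterministic shard suffix so one tenant's writes land
    on several partition-key values instead of a single hot partition.
    Hashing the record id keeps the mapping stable for reads."""
    digest = hashlib.sha256(record_id.encode()).hexdigest()
    shard = int(digest, 16) % N_SHARDS
    return f"TENANT#{tenant_id}#{shard}"
```

Reads for a whole tenant then fan out as N_SHARDS queries (one per suffix), trading a little read complexity for even write distribution.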

DAX with DynamoDB - DAX (DynamoDB Accelerator) is a fully managed, highly available in-memory cache [notice it's an in-memory cache]; it improves performance by reducing request latency from milliseconds to microseconds, even under load; it fails over across multiple availability zones, and being a managed cache it needs minimal code changes; effectively a purpose-built caching layer over DynamoDB;
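DAX itself is a managed service, but its read-through behaviour can be illustrated with a toy cache in front of a backing store: cache hits are served from memory, misses fall through to the (simulated) table. This is an analogy only, not the DAX client API:

```python
# Toy read-through cache illustrating the DAX access pattern.
# The dict "backing_store" stands in for the DynamoDB table.

class ReadThroughCache:
    def __init__(self, backing_store: dict):
        self._db = backing_store  # simulated table (slow path)
        self._cache = {}          # in-memory layer (fast path)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self.hits += 1        # served from memory, microsecond-class
            return self._cache[key]
        self.misses += 1
        value = self._db.get(key)  # fall through to the table
        self._cache[key] = value   # populate for subsequent reads
        return value
```

The point of the pattern is that callers keep a single `get` interface; whether the answer came from memory or the table is invisible to them, which is why DAX needs so little application change.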

DynamoDB Streams - a sequence of item-level change events [inserts, updates, deletes], ordered in FIFO fashion and retained for up to 24 hours; remember: stream records are retained for 24 hours only;
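A stream consumer typically switches on the record's `eventName` to replay inserts, updates, and deletes against some downstream state. The record shape below follows the DynamoDB Streams format, but the sample data and the `pk` attribute name are invented for illustration:

```python
# Minimal stream consumer: routes each change event by eventName.
# Record shape mirrors DynamoDB Streams; sample keys are invented.

def apply_stream_record(record: dict, state: dict) -> None:
    """Replay one stream record (INSERT / MODIFY / REMOVE) into state."""
    name = record["eventName"]
    key = record["dynamodb"]["Keys"]["pk"]["S"]
    if name in ("INSERT", "MODIFY"):
        state[key] = record["dynamodb"]["NewImage"]  # latest item image
    elif name == "REMOVE":
        state.pop(key, None)                         # item was deleted

sample = {
    "eventName": "INSERT",
    "dynamodb": {
        "Keys": {"pk": {"S": "TENANT#acme"}},
        "NewImage": {"pk": {"S": "TENANT#acme"}, "total": {"N": "10"}},
    },
}
```

Because records are FIFO within a partition and expire after 24 hours, consumers must keep up; falling more than a day behind loses events.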
 
Global tables - replicate data across regions around the world; tables are globally distributed, and multi-master tables can be created; replication latency is typically under ONE SECOND; DynamoDB Streams must be enabled for global tables to replicate data across regions; data migration services are also offered - to migrate from on-premises OR relational databases into DynamoDB; a variety of relational databases are supported, and migration involves ZERO DOWNTIME - meaning the source database can remain operational;
 
Multi-tenancy on RDS - follows a natural mapping to the silo, bridge & pool models;
  • silo model - a separate database instance is created & maintained for each tenant
  • bridge model - achieved by creating a different schema for each tenant
    • different tenants, using the bridge model, can run different versions of the product at a given point in time and gradually migrate schemas on a per-tenant basis
    • introducing schema changes is one of the main challenges with the bridge model
  • pool model - moving all data into a shared infrastructure model; tenant data is stored in a single RDS instance and tenants share common tables; the primary theme is trading management & provisioning complexity for agility
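The RDS pool model comes down to one shared table with a `tenant_id` column and every statement scoped by it. A minimal sketch using Python's built-in `sqlite3` as a stand-in for the RDS instance; table and column names are illustrative:

```python
import sqlite3

# Pool-model sketch: one shared table, every row tagged with tenant_id,
# every query filtered by it. sqlite3 stands in for the RDS instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (tenant_id TEXT, order_id TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("acme", "1001", 10.0), ("acme", "1002", 20.0), ("globex", "2001", 99.0)],
)

def orders_for(tenant_id: str):
    """Fetch one tenant's orders; the WHERE clause is the isolation."""
    cur = conn.execute(
        "SELECT order_id, total FROM orders WHERE tenant_id = ?", (tenant_id,))
    return cur.fetchall()
```

In production the same idea is usually hardened with database-level row-level security or per-tenant views, so a forgotten `WHERE tenant_id = ?` cannot leak data across tenants.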
 
Multi-tenancy on Redshift - for the silo model, provisioning separate Redshift clusters is required to establish well-defined boundaries for each tenant; IAM policies & database privileges can be used to control access to data in the cluster; IAM also controls overall cluster management; the bridge model isn't practical for data warehouse workloads; the pool model is similar to how pooled data is stored & monitored across the other associated offerings;
  • fully managed, clustered, petabyte-scale data warehouse; extremely cost-effective compared with on-premises data warehouses such as Teradata or Netezza;
  • it's PostgreSQL-compatible, with JDBC & ODBC drivers available; features massively parallel processing & columnar data storage, optimized for complex queries;
  • Redshift now supports an option to QUERY DIRECTLY FROM S3, a feature called Redshift Spectrum; the concept of a data lake is to LAND UNPROCESSED DATA INTO A LARGE AREA, APPLY A FRAMEWORK and query it;
  • by default, the snapshot retention period is 1 day, with a maximum retention of 35 days; data compression is offered by default and 3 copies of the data are maintained: the original & a backup on the compute nodes + a backup on Amazon S3;
  • for disaster recovery, Redshift can ASYNCHRONOUSLY replicate snapshots across data centers OR regions; a Redshift cluster runs within ONE availability zone only;
 
