Technical Architecture
Objectives
ChainSys’ Smart Data Platform enables the business to achieve these critical needs.
1. Empower the organization to be data-driven
2. All your data management problems solved
3. World class innovation at an accessible price
Subash Chandar Elango
Chief Product Officer
ChainSys Corporation
Subash's expertise in the data management sphere
is unparalleled. As the creative & technical brain behind
ChainSys' products, no problem is too big for Subash,
and he has been part of hundreds of data projects
worldwide.
Introduction
This document describes the Technical Architecture of the ChainSys Platform.
Purpose
The purpose of this Technical Architecture is to define the technologies, products, and techniques necessary to develop and support the system, and to ensure that the system components are compatible and comply with enterprise-wide standards and direction.
Scope
The document's scope is to identify and explain the advantages and risks inherent in this Technical Architecture. This document is not intended to address the installation and configuration details of the actual implementation; those details are provided in technology guides produced during the project.
Audience
The intended audience for this document is project stakeholders, technical architects, and deployment architects.
Platform Component Definition

Architecture Goals
The system's overall architecture goals are to provide a highly available, scalable, and flexible data management platform.
A key architectural goal is to leverage industry best practices to design and develop a scalable, enterprise-wide J2EE application, following industry-standard development guidelines.
All aspects of security must be developed and built into the application, based on best practices.
[Figure: Platform component overview]
- Foundation: Security (authentication / authorization / crypto), User Management (users / groups, roles / responsibilities, access manager), Base Components (workflow, versioning, notification, logging, scheduler, object manager), Gateway Component (API gateway)
- Smart Data Platform: Data Quality Management, Master Data Governance, Analytical MDM (Customer 360, Supplier 360, Product 360), Data Migration, Setup Migration, Test Data Prep, Big Data Ingestion, Data Archival, Data Reconciliation, Data Integration, Data Masking, Data Compliance (PII, GDPR, CCPA, OIOO), Data Cataloging, Data Analytics, Data Visualization
- Smart Business Platform: autonomous regression testing, load and performance testing, and a Rapid Application Development (RAD) framework with a visual development approach (drag & drop design tools, functional components into visual workflow)
Platform Foundation
The Platform Foundation forms the base on which the entire Platform is built. The major components that create the Platform are described in brief.

[Figure: Platform Foundation components]
- Security Management: federated authentication (JWT, SAML, OAuth 2.0), platform/credential authentication, AD and LDAP; an authentication service (credential and SSO authenticators); an authorization engine (org/license, app/node, and access authorization); and a crypto engine (MD5, SHA1, AES 128, AES 256)
- User Management: users, user groups, responsibilities, role hierarchy, object access manager
- Base Components: workflow (approvals, activities, SLA), logging constructs (application, execution, and audit logs), collaborate (email, web notification, chats), versioning (SVN, GIT, database), scheduler (job scheduler, job feedback), and the platform object manager (object sharing, dependent sharing, sharing manager)
- Gateway Component: Platform API, API gateway engine, login API, REST API publisher, SOAP service publisher
User Management
The component manages all the Roles, Responsibilities, Hierarchy, Users, and User Groups.

Responsibilities
The Platform comes with preconfigured Responsibilities for dataZap, dataZen, and dataZense. Organizations can customize Responsibilities, which are assigned to platform objects with additional privileges.

Roles
The Platform comes with predefined Roles for dataZap, dataZen, and dataZense, and organizations can create their own Roles. A role-based hierarchy is also configured for the user-level hierarchy, and roles are assigned the default responsibilities.

Users
Users are assigned only the applications necessary for them, and each User is given a Role. The hierarchy is formed using the role hierarchy setup, where a manager from the next role is assigned. The responsibilities for these roles are set by default for the users; a User can be given additional responsibilities, or have an existing responsibility against a role revoked. Users gain access to objects based on the privileges assigned to the responsibility.
Security Management
The security management component takes care of the following.

SSL
The Platform is SSL/HTTPS enabled on the transport layer with TLS 1.2 support. SSL is applied to the nodes exposed to users, such as the DMZ and Web nodes, and to the nodes exposed to third-party applications, such as the API Gateway nodes.

Authentication Engine
The Platform offers credential-based authentication handled by the Platform itself, as well as Single Sign-On based federated authentication. SSO and credential authentication can co-exist for an organization.

Credential Authentication
User authentication on the Platform happens with the supplied credentials. All successful sessions are logged, and failed attempts are tracked at the user level for locking the user account. A user can have only one web session at any given point in time. The password policy, including expiry, is configured at the Organization level and applies to all users. The enforced password complexity rules include:
- Minimum length
- Maximum length
- Usage of numbers, cases, and special characters
- The number of unsuccessful attempts before lockout is also configurable

Single Sign-On
SSO can be set up with federated services like SAML, OAuth 2.0, or JWT (JSON Web Tokens). The setup for an IdP is configured against the organization profile, and authentication happens in the IdP. This can be either IdP initiated or SP (ChainSys Smart Data Platform) initiated. Organization users with SSO get a different context to log in.
Authorization Engine
The first level of authorization is the Organization License; the licensing engine is also used to set up the organization for authentication. The next level of authorization is the applications assigned to the Organization and the respective User; the individual application nodes are assigned to the organization as per the service agreement to handle load balancing and high availability.
Authorization of pages happens through the responsibilities assigned to the users.
Authorization of a record happens through sharing the record with a group or individual users.
Responsibility and sharing carry the respective privileges to the pages and records.
On conflict, the principle of least privilege is used to determine the access.

Crypto Engine
The Crypto Engine handles both asymmetric encryption and hashing algorithms.
AES 128 is the default encryption algorithm; 256-bit keys are also supported.
The keys are managed within the Platform at the organization level. The usage of keys maintained at the application level determines how they are used for encryption and decryption.
All internal passwords are stored by default with MD5 hashing.
Encryption of the stored data can be done at the Database layer as needed.

Workflow Engine
The workflow engine manages the orchestration of the flow of activities. It is part of the platform foundation and is extended by the applications to add application-specific activities.

Version Management
This component handles the versioning of objects and records eligible for versioning. The foundation has the API to version the objects and their records, which can be extended by the applications to add specific functionality. Currently, the Platform supports SVN as the default and also supports database-level version management. Support for GIT is on the roadmap.

Notification Engine
The notification engine performs all the notifications to the User in the system. It notifies users on the page when they are online in the application. Other notifications, like mail notifications and chat notifications, are also part of this component.

Logging Engine
All activity logs, for both the foundation and the applications, are handled to aid understanding and debugging.
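To illustrate the kind of operation the Crypto Engine performs, here is a minimal AES sketch using the standard javax.crypto API. The Platform documents AES 128 as its default; the GCM mode and the in-memory key handling shown here are illustrative assumptions, not its documented implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class AesSketch {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);                 // AES 128 default; 256-bit keys also supported
        SecretKey key = keyGen.generateKey();

        byte[] iv = new byte[12];         // fresh IV for every encryption
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding"); // mode is an assumption
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal("sensitive value".getBytes(StandardCharsets.UTF_8));

        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        System.out.println(new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8));
    }
}
```

In the Platform itself, keys live at the organization level rather than being generated per call as above.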
API Gateway Engine
The API Gateway forms the foundation for publishing and consuming services with the Platform. All eligible jobs or actions can be published for external applications to access. The following components form the publishing node.

Login Service
The login service authenticates whether the requesting consumer has the proper credentials to invoke the job or action. The publisher engine has two methods of authentication:
- Inline authentication, where every request carries the credentials for authentication and access control.
- Session authentication, where this service is explicitly invoked to get a token, and the other published services use this token to authorize the request.

SOAP Service
Eligible jobs or actions can be published using the Simple Object Access Protocol (SOAP). SOAP is a messaging protocol that allows programs running on disparate operating systems to communicate using Hypertext Transfer Protocol (HTTP) and Extensible Markup Language (XML).

REST Service
Eligible jobs or actions can be published using Representational State Transfer (REST). REST communicates over HTTP like SOAP and can carry messages in multiple formats; dataZap publishes in XML or JSON (JavaScript Object Notation) format.

Scheduler

Scheduler Creation
The scheduler enables you to schedule a job once or on a recurring basis; recurring jobs can be scheduled minutely, hourly, weekly, or monthly.

Scheduler Execution
The scheduler execution engine uses the configuration and fires the job in the respective application. The next job is scheduled at the end of each job as per the configuration.

Job Monitoring
Scheduled jobs are monitored, and their progress and status are tracked at every stage. If an expected job is delayed or unexpected errors occur, the responsible users are notified for action.
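The session-authentication flow described under the Login Service can be pictured with the JDK 11 HttpClient: obtain a token from the login service, then pass it when invoking a published job. The endpoint paths, parameter names, and token format are hypothetical placeholders, not the gateway's actual contract.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GatewaySessionAuthSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Step 1: invoke the login service explicitly to obtain a session token.
        HttpRequest login = HttpRequest.newBuilder()
                .uri(URI.create("https://gateway.example.com/api/login"))    // hypothetical URL
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"username\":\"svc_user\",\"password\":\"***\"}"))
                .build();
        String token = client.send(login, HttpResponse.BodyHandlers.ofString()).body();

        // Step 2: call a published REST job, authorizing the request with the token.
        HttpRequest job = HttpRequest.newBuilder()
                .uri(URI.create("https://gateway.example.com/api/jobs/run")) // hypothetical URL
                .header("Authorization", "Bearer " + token)
                .POST(HttpRequest.BodyPublishers.ofString("{\"jobId\":\"demo\"}"))
                .build();
        System.out.println(client.send(job, HttpResponse.BodyHandlers.ofString()).statusCode());
    }
}
```

With inline authentication, the credentials would instead accompany every request.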
dataZap Component
The Execution Handler is available both on the client side and in the cloud, to handle pure cloud environments and to manipulate data in the cloud for less load at the client end. The Execution Controller is available in the cloud to direct the Execution Handler at every step.
[Figure: dataZap component architecture — endpoint connectors (JCO, JDBC, REST, SOAP, OData) link relational databases, cloud applications, big data lakes, NoSQL databases, enterprise storage systems, message brokers, and enterprise applications to the extract adapter (data object engine, filter engine, child iterator, CDC engine, crypto engine, data stream service), the dataflow adapter (mapper with lookup / sequence / expression transformations; active transformations: aggregator, sorter, unifier, joiner, normalizer, router; localized transformation, TAPI, comparator), and the load adapter (ingestion engine, crypto engine, passive transformation, validation engine, reprocessing engine, reconciliation engine, pre-load and post-load). Supporting components: process flow adapter, scheduler (job initiation, exception notification), reconciliation adapter (comparator, visualization API), versioning engine (object versioning), reporting engine, API gateway (REST API publisher), migration flows (master/transaction and setup migration), and BOTS playback/builder objects.]
Endpoint Connectors
The component has all the base connectors used to connect to most endpoint applications. The base connectors include:
- JDBC - for all RDBMS connections
- SAP JCo - for connecting to SAP systems
- SOAP - connects to applications enabled with SOAP APIs
- REST - connects to applications enabled with REST APIs
- OData - connects to applications enabled with OData APIs
- FTP - to connect and extract data from files on FTP sites
- NoSQL - to connect to NoSQL databases like MongoDB
- Message Broker - to connect to messaging services like ActiveMQ and IBM MQ
Connections through all of the above base connectors can be secured (SSL/TLS) based on the endpoint configuration.
The specific connectors for enterprise applications are wrappers built over these base connectors, with specific security and governance applied as per the application's needs. The diagram shows a few of the existing wrapper endpoints created for the enterprise applications in the market.
For applications that do not have specific connectors, the base connectors can be used, provided the application has no particular authentication methods beyond the base-level authentication provided. ChainSys will build application-specific connectors if they do not already exist.
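As a concrete picture of what the JDBC base connector fundamentally does, here is a minimal sketch that opens a connection to an RDBMS endpoint and pulls rows. The URL, credentials, and table are placeholders; the Platform's wrapper connectors add endpoint-specific security and governance on top of this.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcConnectorSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://source-host:5432/erp"; // placeholder endpoint
        try (Connection conn = DriverManager.getConnection(url, "extract_user", "***");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM customers")) {
            while (rs.next()) {
                System.out.println(rs.getLong("id") + " | " + rs.getString("name"));
            }
        } // try-with-resources closes the result set, statement, and connection
    }
}
```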
Load Adapter
The component that handles the loading of data into multiple systems or endpoints.

Ingestion Engine
Initial data marting happens in this engine, and the data is then manipulated there.

Crypto Engine
The Crypto Engine enables encrypting the data during the data marting process, ensuring the data is protected in all formats. It also applies decryption before loading the data to the final target endpoint.

Reconciliation Engine
The reconciliation engine handles the technical reconciliation of the data in two stages: at the end of the pre-validation stage, to determine the differences between the raw data and the transformed data, and after loading completes, to determine the differences between the loaded data and the raw or changed data.

Reprocessing Engine
This component helps correct errored data, both at the pre-validation level and at the post-load level. Data fixes can be handled both online and offline: users can download the error data as an Excel file and upload the corrected data as a bulk update. In addition to error correction, data can be enhanced or constructed so that it passes the validation step for quality.

Loading Engine
The loading engine is where the application understands the endpoint type and uses one of the loading engines to load the data into the target application. It also has special adapters to use the Playback Adapters of the Smart BOTS and Smart App Builder in the business platform.

Transformation

Passive Transformation
This transformation only changes the values of columns from one form to another. Different transformation types, such as lookup, sequence, and expression-based transformations, can be performed.

TAPI
TAPI helps create reusable transformations (APIs) used across multiple objects, so a change is made in one place rather than in numerous areas.
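The three passive transformation types named above can be sketched in a few lines. Each maps a single column value to a new value without changing the row count; the class shapes below are illustrative assumptions, not dataZap's internal design.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.UnaryOperator;

public class PassiveTransformSketch {
    public static void main(String[] args) {
        // Lookup transformation: map source codes to target values.
        Map<String, String> countryLookup = new HashMap<>();
        countryLookup.put("US", "United States");
        countryLookup.put("DE", "Germany");

        // Sequence transformation: generate surrogate keys.
        AtomicLong sequence = new AtomicLong(1000);

        // Expression transformation: a value-level expression (trim + upper-case).
        UnaryOperator<String> expression = v -> v.trim().toUpperCase();

        for (String raw : new String[] {" us ", " de "}) {
            String code = expression.apply(raw);                  // expression
            String name = countryLookup.getOrDefault(code, code); // lookup
            long key = sequence.incrementAndGet();                // sequence
            System.out.println(key + " | " + code + " | " + name);
        }
    }
}
```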
Extract Adapter
The extract adapter component retrieves data from many different types of endpoints and processes it into the expected format.

Data Object Engine
This engine handles almost all kinds of systems and formats for retrieving data. It can work with SQL, flat files, and SOAP and REST services.

Filters
Filters reduce the number of rows from the raw data extracted from the source.

Child Iterator
This component handles the master-child relationship between the data extracts, so that a filter applied on the master propagates down to all the child levels.

Crypto Engine
The Crypto Engine reads or extracts the data with encryption applied over the fields selected for extraction. This helps keep the data encrypted against access from either the front end or the back end.

Data Streaming Service
Here the data extracted by the data object or the extract adapter is streamed to the applications calling the service to pull the data from the endpoint.

Changed Data Capture
This component gets the changed data from the source. Two modes can achieve this. The recommended option is assigning a date field used to bring in the changed data (see the JDBC sketch after the next subsection). There is also an option to bring in data by comparing records, but this is resource-intensive and is not recommended unless there is no date field to compare.

Active Transformation
An active transformation affects the number of rows. The available active transformations are the Normalizer, Joiner, Router, Unifier, Aggregator, and Sorter. The active transformation engines convert the data structures from the source to the target. The rules engine (Router) moves data to different endpoints according to the rules, and data can be compared between two systems to determine an action before moving it.
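The recommended date-field CDC mode boils down to filtering on a last-update timestamp. Below is a minimal JDBC sketch; the table, column, and watermark bookkeeping are placeholders rather than the Platform's actual mechanism.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;
import java.time.Instant;

public class DateFieldCdcSketch {
    public static void main(String[] args) throws Exception {
        Instant lastRun = Instant.parse("2024-01-01T00:00:00Z"); // persisted watermark (placeholder)
        String url = "jdbc:postgresql://source-host:5432/erp";

        try (Connection conn = DriverManager.getConnection(url, "extract_user", "***");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT id, name, last_update_date FROM customers WHERE last_update_date > ?")) {
            ps.setTimestamp(1, Timestamp.from(lastRun));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Only rows changed since the last successful extract are returned.
                    System.out.println(rs.getLong("id") + " changed at "
                            + rs.getTimestamp("last_update_date"));
                }
            }
        }
    }
}
```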
Dataflow Adapter
The dataflow adapter helps in transforming and mapping the data from multiple sources to multiple target systems.

Migration Flow
This component overrides the workflow component in the foundation. The migration flow engine is specific to migrating Master or Transaction Data. It has orchestration capabilities and human-intervention capabilities like Approval, User Confirmation, and Receive Input.

Process Flow Engine
This component overrides the workflow component in the foundation. The process flow engine is specific to data movement. It has all the orchestration capabilities and human-intervention capabilities like Approval, User Confirmation, and Receive Input.

Scheduler
These components provide the job execution agents specific to dataZap that are executed by the base scheduling engine. They are wrappers for the data movement components: Load Adapters, Extract Adapters, Dataflow Adapters, and Process Flows.

API Gateway
These are execution agents for the publisher in the foundation. They form the wrapper for the jobs that need to be executed in the data movement components: Load Adapters, Extract Adapters, Dataflow Adapters, and Process Flows.
Reconciliation Adapter
The reconciliation adapter generates the query to compare the data and produces the result through the Visualization API to create the necessary reconciliation dashboard.

Reporting Engine
The reporting engine generates reports on the various adapters' executions and produces dashboards to understand the actions taken and still to be taken.
System Technology Landscape
[Figure: System technology landscape — DMZ nodes (Apache HTTPD server: web load balancing, reverse proxy, forward proxy, single sign-on), web application nodes (Apache Tomcat 9, JDK 11), collaborate server (Apache ActiveMQ), API gateway, foundation nodes (caching node, scheduler node, file/log server), visualization and analytics stack (dimple.js, R Analytics, NodeJS 12.16, Ionic V4, Selenium WebDriver), and default data stores (metadata store, versioning store, data mart, indexing store, app data store, Apache CouchDB).]
Apache HTTPD
The Apache HTTPD server routes calls to the Web nodes and handles load balancing for both the Web Server nodes and the API Gateway nodes. The following Apache HTTPD features are used:
- Highly scalable
- Forward / reverse proxy with caching
- Multiple load balancing mechanisms
- Fault tolerance and failover with automatic recovery
- WebSocket support with caching
- Fine-grained authentication and authorization access control
- Loadable dynamic modules, such as ModSecurity for WAF
- TLS/SSL with SNI and OCSP stapling support

Single Sign-On
This node is built on the Spring Boot application with Tomcat as the servlet container. Organizations opting for single sign-on have a separate SSO node with a particular context; the default context takes users to platform-based authentication.

DMZ Nodes
These nodes are generally the only nodes exposed to the external world outside the enterprise network. The two nodes in this layer are the Apache HTTPD server and the Single Sign-On node.

Web Nodes
This layer consists of the nodes exposed to users for invoking actions through the frontend, or to third-party applications as APIs. The nodes in this layer are the Web Server that renders the web pages, the API Gateway through which other applications interact with the application, and the Collaborate node for notifications.
Web Server
The web application server hosts all the web pages of the ChainSys Platform.
- Apache Tomcat 9.x is used as the servlet container.
- JDK 11 is the JRE used for the application. The Platform works on OpenJDK, Azul Zulu, AWS Corretto, and Oracle JDK.
- Struts 1.3 is used as the controller framework.
- Integration between the web server and the application nodes is handled with microservices based on Spring Boot.
- The presentation layer uses HTML 5 / CSS 3 components and scripting frameworks like jQuery, d3.js, etc.
- The web server can be clustered to n nodes according to the number of concurrent users and requests.

Gateway Node
This node uses all the default application services.
- It uses Jetty to publish the APIs as SOAP or REST APIs.
- The API Gateway can be clustered based on the number of concurrent API calls from external systems.
- Denial of Service (DoS) protection is implemented in both JAX-WS and JAX-RS to prevent attacks.

Collaborate
This node handles all the different kinds of notifications to users, such as front-end notifications, emails, and push notifications (on the roadmap). It also provides chat services that the applications can use as needed.
- The notification engine uses Netty APIs for sending notifications from the Platform.
- Apache ActiveMQ is used for messaging the notifications from the application nodes.
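Since the web tier talks to the application nodes through Spring Boot microservices, a minimal sketch of such a service is shown below. The endpoint path and payload are hypothetical, not the Platform's actual service contract.

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// A minimal Spring Boot microservice of the kind the web server could call
// on an application node; the endpoint below is a hypothetical illustration.
@SpringBootApplication
@RestController
public class ApplicationNodeServiceSketch {
    @GetMapping("/api/health")
    public String health() {
        return "{\"status\":\"UP\"}";
    }

    public static void main(String[] args) {
        SpringApplication.run(ApplicationNodeServiceSketch.class, args);
    }
}
```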
Application Nodes
The application nodes are Spring Boot applications that communicate with the other application nodes and the web servers.
- Load balancing is handled by HAProxy, based on the number of nodes instantiated for each application.
- JDK 11 is the JRE used for the application. The Platform works on OpenJDK, Azul Zulu, AWS Corretto, and Oracle JDK.

Application-specific notes:
- dataZap uses only the default services mentioned above.
- dataZen uses only the default services mentioned above.
- dataZense (Analytical Services / Catalog Services) uses all the default services mentioned above; in addition, it uses R analytics for Machine Learning algorithms, and D3 and Dimple JS for the visual layer.
- Smart BOTS uses all the default services mentioned above, plus the Selenium API and Sikuli for web-based automation.
- Smart App Builder uses all the default services mentioned above to configure the custom applications and to generate dynamic web applications as configured. The mobile applications' service needs NodeJS 12.16, which uses the Ionic Framework V4 to build the web and mobile apps for the configured custom applications.
Data Storage Nodes

Database
The ChainSys Platform supports both PostgreSQL 9.6 or higher and Oracle 11g or higher for:
- The metadata of the setups and configurations of the applications
- Data marting for the temporary storage of the data
The Platform uses PostgreSQL for the metadata in the cloud. PostgreSQL is a highly scalable database:
1. It is designed to scale vertically by running on bigger and faster servers when you need more performance.
2. It can be configured for horizontal scaling; Postgres has useful streaming replication features, so multiple replicas can be created and used for reading data.
3. It can easily be configured for High Availability based on the above.

Scheduler Node
This node uses only the default application node services. It can be clustered only as failover nodes: when the primary node is down, HAProxy makes the secondary node the primary. The secondary node handles notifications and the automatic rescheduling of jobs; it calls each schedulable application object so that all possible exception scenarios are addressed. Once the failed node is up and running again, it becomes the secondary node.
Cache Server
Redis is used for caching the platform configuration objects and execution progress information. This avoids network latency across the database and thus increases the performance of the application.
When durability of the data is not needed, the in-memory nature of Redis allows it to perform well compared to database systems that write every change to disk before considering a transaction committed.
The component is set up as a distributed cache service to enable better performance during data access. Redis can be clustered for high availability and supports master-replica replication.
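The caching pattern described above looks roughly like the following sketch, written with the Jedis client (the Platform's actual Redis client library is not specified here). The host, key, and TTL are placeholders.

```java
import redis.clients.jedis.Jedis;

public class ConfigCacheSketch {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("cache-host", 6379)) {
            // Cache a configuration object with a TTL instead of re-reading it
            // from the metadata database on every request.
            jedis.setex("org:42:config", 300, "{\"theme\":\"dark\"}");

            String cached = jedis.get("org:42:config"); // avoids a database round trip
            System.out.println(cached != null ? cached : "cache miss -> load from PostgreSQL");
        }
    }
}
```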
The multi-tenant database architecture is designed based on the following:
- A separate database for each tenant
- Trusted database connections for each tenant
- Secure database tables for each tenant
- Easily extensible custom columns
- Scalability handled through single-tenant scale-out

PostgreSQL offers encryption at several levels and provides flexibility in protecting data from disclosure due to database server theft, unscrupulous administrators, and insecure networks. Encryption might also be required to secure sensitive data. The available options include:
- Password storage encryption
- Encryption for specific columns
- Data partition encryption
- Encrypting passwords across a network
- Encrypting data across a network
- SSL host authentication
- Client-side encryption
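One way to realize "encryption for specific columns" in PostgreSQL is the pgcrypto extension; a sketch is below. This illustrates the general approach rather than the Platform's concrete mechanism, and it assumes CREATE EXTENSION pgcrypto; has been run and a bytea column exists for the protected value.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ColumnEncryptionSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://db-host:5432/appdata"; // placeholder
        try (Connection conn = DriverManager.getConnection(url, "app_user", "***");
             PreparedStatement ps = conn.prepareStatement(
                     // pgp_sym_encrypt is a pgcrypto function; ssn is a bytea column
                     "INSERT INTO customers (name, ssn) VALUES (?, pgp_sym_encrypt(?, ?))")) {
            ps.setString(1, "Acme Corp");
            ps.setString(2, "123-45-6789");   // plaintext value to protect
            ps.setString(3, "org-level-key"); // key management is the hard part in practice
            ps.executeUpdate();
        }
    }
}
```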
[Figure: Object groups managed by versioning — loader adapters, data objects, data extracts, data flows, process flows, migration flows, and reconciliations; data models, rules, augmentations, and workflows; data sets, views, dashboards, and ad-hoc reports; object models, layouts, and workflows.]
File Log Server
This component is used for centralized logging: it stores the application logs, execution logs, and error logs of the platform applications on a common server. Log4J is used for distributed logging. These logs can be downloaded for monitoring and auditing purposes; a small HTTP service allows users to download the files from this component. It is implemented with the single-tenant scale-out approach.
Subversion (SVN) Server
Apache Subversion (abbreviated as SVN) is a software versioning and revision control system distributed
as open-source under the Apache License. The Platform uses SVN to version all the metadata
configurations to revert in the same instance or move the configurations to multiple instances for
different milestones. All the applications in the Platform use the foundation APIs to version their objects
as needed.
Apache SOLR
The ChainSys Platform uses SOLR for its data cataloging needs as an indexing and search engine. Solr is an open-source enterprise search platform. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features, and rich document handling. Apache Solr was chosen over the alternatives for the following reasons.

Real-Time, Massive Read and Write Scalability
Solr supports large-scale, distributed indexing, search, and aggregation/statistics operations, enabling it to handle both large and small applications. Solr also supports real-time updates and can take millions of writes per second.

SQL and Streaming Expressions/Aggregations
Streaming expressions and aggregations provide the basis for running traditional data warehouse workloads on a search engine, with the added enhancement of basing those workloads on much more complex matching and ranking criteria.

Security Out of the Box
With Solr, security is built in, integrating with systems like Kerberos, SSL, and LDAP to secure the deployment and the content inside of it.

Fully Distributed Sharding Model
Solr moved from a master-replica model to a fully distributed sharding model in Solr 4, focusing on consistency and accuracy of results over other distributed approaches.

Cross-Data Center Replication Support
Solr supports active-passive CDCR, enabling applications to synchronize indexing operations across data centers in different regions without third-party systems.

Solr Is Highly Big Data Enabled
Users can store Solr's data in HDFS. Solr integrates nicely with Hadoop's authentication approaches and leverages ZooKeeper to simplify its fault-tolerance infrastructure.

Documentation and Support
Solr has an extensive reference guide that covers the functional and operational aspects of every version.

Solr and Machine Learning
Solr is actively adding capabilities to make Learning to Rank (LTR) out-of-the-box functionality.
Apache CouchDB
The ChainSys Platform uses CouchDB for mobile applications in the Application Builder module. PostgreSQL is the initial entry point for the Dynamic Web Applications; the data in PostgreSQL syncs with CouchDB if mobile applications are enabled. In contrast, the initial entry point for the Dynamic Mobile Applications is PouchDB: CouchDB syncs with the PouchDB on the mobile devices, which then syncs with PostgreSQL.
The main features for choosing CouchDB are:
- CouchDB uses HTTP and REST as its primary means of communication, so client apps can talk to the database directly.
- The Couch Replication Protocol lets data flow seamlessly between server clusters, mobile phones, and web browsers, enabling a compelling offline-first user experience while maintaining high performance and strong reliability.
- CouchDB was designed from the bottom up to enable easy synchronization between different databases.
- CouchDB uses JSON as its data format.
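Replication, which the text above highlights, is driven entirely over CouchDB's documented HTTP API; the _replicate endpoint triggers a one-off replication. The host, database names, and the absence of authentication below are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CouchReplicationSketch {
    public static void main(String[] args) throws Exception {
        // Ask the local CouchDB to replicate a database to a remote target.
        String body = "{\"source\":\"app_layouts\","
                + "\"target\":\"http://replica-host:5984/app_layouts\"}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://couch-host:5984/_replicate")) // placeholder host
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```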
Deployment at Customer: Distributed Mode
The ChainSys Smart Data Platform is a highly distributed application with a highly scalable environment. Most of the nodes are horizontally and vertically scalable.
[Figure: Distributed deployment topology — a DMZ Services VM (Apache HTTPD server, single sign-on, load balancer) fronting a web container cluster (web page services, collaborate services, API gateway; nodes 1..n), a foundation services cluster (caching node, file/log services, scheduling services with primary and secondary nodes), Smart Data Platform application clusters (service nodes 1..n, including design & process, layout build, and layout rendering services), and a database layer (database cluster with primary and secondary metadata/datamart nodes, a versioning VM, a SOLR cluster with master and slave nodes and multiple cores, and a CouchDB cluster).]
DMZ Nodes
Apache HTTPD would be needed in a distributed environment as a load balancer. This would also be
used as a reverse proxy for access outside the network. This would be a mandatory node to be available.
SSO Node would be needed only if there is a need for the Single-Sign-On capability with any federated
services.
Web Cluster
ChainSys recommends a minimum two-node web cluster to handle high availability and load balancing for better performance. This is a mandatory node to be deployed for the ChainSys Platform.
The number of nodes is not restricted to two and can be scaled as per the concurrent usage of the application pages.
The Collaborate node is generally a single node, but it can be configured for High Availability if needed.

Gateway Cluster
The API Gateway nodes are not mandatory to deploy; they are required only when the application APIs need to be exposed outside the Platform.
When deployed, ChainSys recommends a two-node cluster to handle high availability and load balancing under high API call volumes.
The number of nodes in the cluster can be determined based on the volume of API calls and is not restricted to two.
Application Cluster
The HAProxy or Apache HTTPD acts as the load balancer. All the calls within the application nodes are
handled based on the node configuration. If the Apache HTTPD is used in the DMZ for Reverse Proxy,
it is recommended to have HAProxy for internal routing or a separate Apache HTTPD.
The number of nodes in the cluster is not restricted to two. Individual application nodes can be scaled
horizontally for load balancing as per the processing and mission-critical needs.
Integration Cluster is a mandatory node that will be deployed in the Platform. All the other applications
depend on this application for all the integration needs.
Visualization Cluster is also a mandatory node that will be deployed in the Platform. All the other
applications depend on this application for all the dashboard report needs.
Data Storage Nodes
Generally, the PostgreSQL database is configured for High Availability as an Active-Passive instance. Depending on the number of read-write operations, it can be load balanced too. It can be replaced by Oracle 11g or higher if the client wants to use an existing database license.
File Server would be needed only if there is no NAS or SAN availability to mount the same disk space
into the clusters to handle the distributed logging. The NFS operations for distributed logging would
require this Node.
SVN server would be mandatory to store all the configuration objects in the repository for porting from
one instance to the other. Generally, it would be a single node as the operation on this would not be
too high.
REDIS is used as a cache engine. It is mandatory for distributed deployment. This can be configured for
high availability using the master-slave replication.
SOLR would be needed only if data cataloging is implemented, and search capability is enabled.
This can be configured for High Availability. SOLR sharding can be done when the Data is too large for
one Node or distributed to increase performance/throughput.
CouchDB would be needed only if dynamic mobile applications are to be generated. CouchDB can be
configured for high availability. For better performance, Chainsys recommends having individual
instances of CouchDB for each active application.
The visualization node uses RStudio Server for Machine Learning capabilities. It is needed only when Machine Learning algorithms are to be used.
When deploying the MDM, the "Smart Application Builder" node is needed for dynamic layout generation and augmentation. The reverse does not apply, as "Smart Application Builder" is not dependent on the MDM nodes.
NodeJS is needed only when mobile applications are to be dynamically generated. The Apache HTTPD server handles its load balancing.
The Scheduler cluster is needed if even one of the applications uses the scheduling capability. The cluster provides only High Availability (failover) and is not load balanced; the number of nodes is restricted to two.
Deployment at Customer: Single Node

[Figure: Single-node deployment — a DMZ Services VM (Apache HTTPD server, single sign-on) in front of one Application Services VM running the Foundation Package (web services on Apache Tomcat 9, Apache ActiveMQ, foundation services, caching service, scheduling services, file/log server, collaborate services) together with the Smart Data Platform, Smart App Builder (design & process, layout build & render), Smart BOTS, analytics, and catalog services, plus separate Indexing, NoSQL, Versioning, and Database (metadata/datamart) VMs.]
"Single Node" is not meant literally: all application services of the ChainSys Platform are deployed on a single node or server, while the data storage nodes remain separate servers or nodes. This type of installation is generally for a patching environment without too many operations. It is also recommended for non-mission-critical development activities where high availability and scalability are not determining factors.
DMZ Nodes
Apache HTTPD would be needed only if a reverse proxy is required for access outside the network.
This is not a mandatory node for a Single Node installation.
SSO Node would be needed only if there is a need for the Single-Sign-On capability with any federated
services.
Application Server
There is just one Apache Tomcat instance as the web application service, and it is not configured for high availability.
The Collaborate service includes Apache ActiveMQ and the Spring integration service.
The API Gateway is required only if objects are to be published as REST APIs or SOAP services; this service can be shut down if not needed.
The Integration Service, Visualization Service, and Scheduler Service are mandatory services that keep running.
The rest of the applications can be running or shut down depending on the license and need.
Data Storage Nodes
PostgreSQL would be in a separate node. Chainsys does not recommend having the applications and the
Databases on the same machine.
SVN server would be mandatory to store all the configuration objects in the repository for porting from
one instance to the other.
SOLR would be needed only if data cataloging is implemented, and search capability is enabled.
CouchDB would be needed only if dynamic mobile applications are to be generated as a separate node.
Deployment at Customer: Instance Strategy
Generally, the instance propagation strategy below is recommended. Depending on the applications in use and the load, either a single-node deployment or a distributed deployment model can be chosen; a distributed deployment is generally recommended for Production instances. The adapters are forward propagated using the SVN repository.
All instances need not follow the same deployment model. For reverse propagation, for example from Production to Non-Production instances, the application and data storage layers can be cloned and the node configurations re-configured for the lower instances.

[Figure: Instance strategy — DEV, TST/QA, and PRD instances, each with its own Meta DB.]

Built-in configuration management provides check-in and check-out without leaving the ChainSys Platform:
- It gives a great software development lifecycle process for your projects.
- All your work is protected in a secure location and backed up regularly.
[Figure: Cloud deployment topologies — Private Cloud: per-tenant Dev and Prod DMZ subnets (Apache HTTPD servers), application subnets, and data subnets. Public Cloud: two virtual networks with shared Dev and Prod DMZ and application subnets and per-tenant data subnets. In both models, each tenant's on-premise network connects through a per-tenant gateway over a site-to-site tunnel.]
Pure Cloud Deployment
The ChainSys Platform is available on the cloud. The Platform is hosted as a Public Cloud and also offers Private Cloud options.
Public Cloud
Connectivity to the customer data center is handled through site-to-site tunneling between the tenant's data center and the ChainSys data center. Individual gateway routers can be provisioned per tenant.
Tenants share the same application and DMZ node clusters, except for the data storage nodes. If a tenant needs a separate application node for higher workloads, that particular application node set can be dedicated to the specific tenant.
As mentioned earlier in the Database section, multi-tenancy is handled at the database level: tenants have separate database instances, and the databases are provisioned based on the license and the subscription.
Depending on the workload on the nodes, each node can be clustered to balance the load.
Private Cloud
Customers (tenants) have all the application, DMZ, and data storage nodes assigned to the specific tenant; none of them are shared. Depending on the workload on the nodes, each node can be clustered to balance the load. The application nodes and databases are provisioned based on the license and subscription.
Hybrid Cloud Deployment
This can be combined with either the Private or Public cloud. An Agent is deployed on the client organization's premises or data center to access the endpoints. This avoids creating a site-to-site tunnel between the client data center and the ChainSys cloud data center.
There is a proxy (Apache HTTPD Server) on both sides, at the ChainSys data center and the client data center. All back-and-forth communication between the ChainSys data center and the Agent is routed through the proxy only. The ChainSys cloud sends instructions to the Agent to start a task along with the task information; the Agent executes the task and sends the response back to the cloud with the task's status.
The Agents (for dataZap, dataZense, and Smart BOTS) are deployed as needed.
For dataZap, the existing database (either PostgreSQL or Oracle) can be used for the staging process. The Agent executes all integration and migration tasks by connecting directly to the source and target systems, validating and transforming data, and transferring data between them.
For dataZen and Smart App Builder, data is streamed to the ChainSys Platform for manipulation.
[Figure: Hybrid cloud deployment — the ChainSys data center (DMZ nodes with Apache HTTPD and single sign-on, web nodes on Apache Tomcat 9, collaborate server, API gateway, application cluster nodes for analytics, catalog, design & process, and layout build services, scheduler node, file/log server, caching node, and data stores: metadata, versioning, data mart, indexing, app data, CouchDB) communicates through proxies with the client data centre, where the agent executables (dataZap, analytics, catalog) run against the local datamart and endpoints.]
Disaster Recovery
All the application nodes and the web nodes are replicated using RSYNC; the specific install directory and any other log directories are synced to the secondary replication nodes.
For PostgreSQL, the streaming replication feature is used, which relies on archive log shipping.
SOLR comes with a built-in CDCR (Cross Data Center Replication) feature, which can be used for disaster recovery.
CouchDB has an outstanding replication architecture, which replicates the primary database to the secondary database.
The RPO can be set as per the needs, individually for both applications and databases. The RTO for the DR would be approximately an hour.
[Figure: DR replication paths — primary application & DB nodes replicate to the secondary site: application nodes via RSYNC, PostgreSQL nodes via streaming replication with archive log shipping, SOLR nodes via CDC replication, and CouchDB nodes via CouchDB replication.]
Application Monitoring

Third-Party Monitoring Tools
ChainSys uses third-party open-source monitoring tools such as Zabbix and Jenkins to monitor all the nodes. Zabbix supports tracking the availability and performance of the servers, virtual machines, applications (like Apache, Tomcat, ActiveMQ, and Java), and databases (like PostgreSQL, Redis, etc.) used in the Platform. Using Zabbix, the following are achieved:
- Various data collection methods and protocols
- Instant monitoring of all metrics using out-of-the-box templates
- Flexible trigger expressions and trigger dependencies
- Proactive network monitoring
- Remote command execution
- Flexible notifications
- Integration with external applications using the Zabbix API
Individual application monitoring systems can also be used for more in-depth analysis, but an integrated approach to looking into problems helps us be proactive and faster.
In-Built Monitoring System
ChainSys is working on its own application monitoring tool that monitors essential parameters like CPU and memory. This tool is also planned to help monitor individual threads within the application, and is intended to carry out most maintenance activities, like patching, cloning, and database maintenance, from one single toolset. It will be integrated with Zabbix for monitoring and alerting.
Supported Endpoints (Partial)
- Cloud Applications: Oracle Sales Cloud, Oracle Marketing Cloud, Oracle Engagement Cloud, Oracle CRM On Demand, SAP C/4HANA, SAP S/4HANA, SAP BW, SAP Concur, SAP SuccessFactors, Salesforce, Microsoft Dynamics 365, Workday, Infor Cloud, Procore, Planview Enterprise One
- PLM, MES & CRM: Windchill PTC, Oracle Agile PLM, Oracle PLM Cloud, Teamcenter, SAP PLM, SAP Hybris, SAP C/4HANA, Enovia, Proficy, Honeywell OptiVision, Salesforce Sales, Salesforce Marketing, Salesforce CPQ, Salesforce Service, Oracle Engagement Cloud, Oracle Sales Cloud, Oracle CPQ Cloud, Oracle Service Cloud, Oracle Marketing Cloud, Microsoft Dynamics CRM
- HCM & Supply Chain Planning: Oracle HCM Cloud, SAP SuccessFactors, Workday, ICON, SAP APO and IBP, Oracle Taleo, Oracle Demantra, Oracle ASCP, Steelwedge
- Project Management & EAM: Oracle Primavera, Oracle Unifier, SAP PM, Procore, Ecosys, Oracle EAM Cloud, Oracle Maintenance Cloud, JD Edwards EAM, IBM Maximo
- Enterprise Storage Systems: OneDrive, Box, SharePoint, File Transfer Protocol (FTP), Oracle Webcenter, Amazon S3
- Big Data: HIVE, Apache Impala, Apache HBase, Snowflake, MongoDB, Elasticsearch, SAP HANA, Hadoop, Teradata, Oracle Database, Redshift, BigQuery
- NoSQL Databases: MongoDB, Solr, CouchDB, Elasticsearch
- Databases: PostgreSQL, Oracle Database, SAP HANA, SYBASE, DB2, SQL Server, MySQL, MemSQL
- Message Broker: IBM MQ, ActiveMQ
- Development Platform: Java, .Net, Oracle PaaS, Force.com, IBM, ChainSys Platform
- Enterprise Applications: Oracle E-Business Suite, Oracle ERP Cloud, Oracle JD Edwards, Oracle PeopleSoft, SAP S/4HANA, SAP ECC, IBM Maximo, Workday, Microsoft Dynamics, Microsoft Dynamics GP, Microsoft Dynamics Nav, Microsoft Dynamics Ax, Smart ERP, Infor, BaaN, Mapics, BPICS
One Platform for your
Data Management needs
End to End
www.chainsys.com
Data Migration
Data Reconciliation
Data Integration
Data Quality Management
Data Governance
Analytical MDM
Data Analytics
Data Catalog
Data Security & Compliance
More Related Content

ODP
Authentication and Single Sing on
PPTX
Introduction-to-Blood-Bank-and-Donor-Management-System.pptx
PPT
SOSCOE Overview
PPT
Adobe PDF and LiveCycle ES Security
DOCX
How the detailed process of soa
PDF
APIsecure 2023 - API orchestration: to build resilient applications, Cherish ...
PDF
Web Based Investment Management System
PPTX
Granite state #spug The #microsoftGraph and #SPFx on steroids with #AzureFunc...
Authentication and Single Sing on
Introduction-to-Blood-Bank-and-Donor-Management-System.pptx
SOSCOE Overview
Adobe PDF and LiveCycle ES Security
How the detailed process of soa
APIsecure 2023 - API orchestration: to build resilient applications, Cherish ...
Web Based Investment Management System
Granite state #spug The #microsoftGraph and #SPFx on steroids with #AzureFunc...

Similar to Technical Architecture - Chainsys dataZap (20)

PPTX
presentation_finals
PPTX
Internship msc cs
PPTX
Blockchain solution architecture deliverable
PPTX
Disaster_Reovery1_Patrol_Continuity.pptx
PDF
Dairy management system project report..pdf
PPTX
Azure. Is It Worth It? - TechEd Beijing 2010 - Ethos
DOCX
Middleware – Its Types, Architecture, and Benefits.docx
DOCX
Web based booking a car taxi5
DOCX
project on Agile approach
PDF
PARKING ALLOTMENT SYSTEM PROJECT REPORT REPORT.
PPTX
Observe It Presentation
PDF
International Journal of Engineering Inventions (IJEI)
PDF
Transforming The Customer Experience With Real-Time Insights
DOC
Java project titles
PDF
Online airline reservation system project report.pdf
PPTX
Online news 365
PPT
Cartes Asia Dem 2010 V2
PPT
Share Point Server Security with Joel Oleson
PDF
travel portal for flights booking trave
PDF
ghgh.pdf travel portal for flights booking right
presentation_finals
Internship msc cs
Blockchain solution architecture deliverable
Disaster_Reovery1_Patrol_Continuity.pptx
Dairy management system project report..pdf
Azure. Is It Worth It? - TechEd Beijing 2010 - Ethos
Middleware – Its Types, Architecture, and Benefits.docx
Web based booking a car taxi5
project on Agile approach
PARKING ALLOTMENT SYSTEM PROJECT REPORT REPORT.
Observe It Presentation
International Journal of Engineering Inventions (IJEI)
Transforming The Customer Experience With Real-Time Insights
Java project titles
Online airline reservation system project report.pdf
Online news 365
Cartes Asia Dem 2010 V2
Share Point Server Security with Joel Oleson
travel portal for flights booking trave
ghgh.pdf travel portal for flights booking right
Ad

More from Chainsys SEO (12)

PDF
dataZen - Customer Data Info Management (CDM)
PDF
A Proven Approach to Enterprise Data Reconciliation
PDF
Data Sheet dataZap for Data and Setup Migration
PDF
Enterprise Data A Proven Approach to Cleansing and Migration
PDF
Oracle Cloud Applications Master and Transactions Templates
PDF
SAP S4 HANA Applications Setup Master and Transactions Templates
PDF
Oracle Cloud Application dataZap Setup Templates
PDF
Data Sheet Cloud Integration Platform - dataZap
PDF
Seamless Data Migration to Oracle Fusion Cloud
PDF
SAP to Hadoop data integration process Steps
PDF
Archive Smart & Save Big Time with ChainSys AI-Powered Smart Data Platform
PDF
Strategic Data Migration From Oracle NetSuite to Cloud Fusion Final.pdf
dataZen - Customer Data Info Management (CDM)
A Proven Approach to Enterprise Data Reconciliation
Data Sheet dataZap for Data and Setup Migration
Enterprise Data A Proven Approach to Cleansing and Migration
Oracle Cloud Applications Master and Transactions Templates
SAP S4 HANA Applications Setup Master and Transactions Templates
Oracle Cloud Application dataZap Setup Templates
Data Sheet Cloud Integration Platform - dataZap
Seamless Data Migration to Oracle Fusion Cloud
SAP to Hadoop data integration process Steps
Archive Smart & Save Big Time with ChainSys AI-Powered Smart Data Platform
Strategic Data Migration From Oracle NetSuite to Cloud Fusion Final.pdf
Ad

Recently uploaded (20)

PDF
Building a Smart Pet Ecosystem: A Full Introduction to Zhejiang Beijing Techn...
PDF
pdfcoffee.com-opt-b1plus-sb-answers.pdfvi
PDF
How to Get Funding for Your Trucking Business
PPT
Lecture 3344;;,,(,(((((((((((((((((((((((
PDF
TyAnn Osborn: A Visionary Leader Shaping Corporate Workforce Dynamics
PDF
Nante Industrial Plug Factory: Engineering Quality for Modern Power Applications
PPTX
operations management : demand supply ch
PPTX
Sales & Distribution Management , LOGISTICS, Distribution, Sales Managers
PDF
Introduction to Generative Engine Optimization (GEO)
PDF
Family Law: The Role of Communication in Mediation (www.kiu.ac.ug)
PDF
kom-180-proposal-for-a-directive-amending-directive-2014-45-eu-and-directive-...
PPTX
2025 Product Deck V1.0.pptxCATALOGTCLCIA
PPTX
Astra-Investor- business Presentation (1).pptx
PDF
Keppel_Proposed Divestment of M1 Limited
PDF
Cours de Système d'information about ERP.pdf
PDF
Digital Marketing & E-commerce Certificate Glossary.pdf.................
PDF
Charisse Litchman: A Maverick Making Neurological Care More Accessible
PDF
Solara Labs: Empowering Health through Innovative Nutraceutical Solutions
PPTX
svnfcksanfskjcsnvvjknsnvsdscnsncxasxa saccacxsax
PDF
Blood Collected straight from the donor into a blood bag and mixed with an an...
Building a Smart Pet Ecosystem: A Full Introduction to Zhejiang Beijing Techn...
pdfcoffee.com-opt-b1plus-sb-answers.pdfvi
How to Get Funding for Your Trucking Business
Lecture 3344;;,,(,(((((((((((((((((((((((
TyAnn Osborn: A Visionary Leader Shaping Corporate Workforce Dynamics
Nante Industrial Plug Factory: Engineering Quality for Modern Power Applications
operations management : demand supply ch
Sales & Distribution Management , LOGISTICS, Distribution, Sales Managers
Introduction to Generative Engine Optimization (GEO)
Family Law: The Role of Communication in Mediation (www.kiu.ac.ug)
kom-180-proposal-for-a-directive-amending-directive-2014-45-eu-and-directive-...
2025 Product Deck V1.0.pptxCATALOGTCLCIA
Astra-Investor- business Presentation (1).pptx
Keppel_Proposed Divestment of M1 Limited
Cours de Système d'information about ERP.pdf
Digital Marketing & E-commerce Certificate Glossary.pdf.................
Charisse Litchman: A Maverick Making Neurological Care More Accessible
Solara Labs: Empowering Health through Innovative Nutraceutical Solutions
svnfcksanfskjcsnvvjknsnvsdscnsncxasxa saccacxsax
Blood Collected straight from the donor into a blood bag and mixed with an an...

Technical Architecture - Chainsys dataZap

  • 2. Objectives ChainSys’ Smart Data Platform enables the business to achieve these critical needs. 1. Empower the organization to be data-driven 2. All your data management problems solved 3. World class innovation at an accessible price Subash Chandar Elango Chief Product Officer ChainSys Corporation Subash's expertise in the data management sphere is unparalleled. As the creative & technical brain behind ChainSys' products, no problem is too big for Subash, and he has been part of hundreds of data projects worldwide.
  • 3. Introduction This document describes the Technical Architecture of the Chainsys Platform Purpose Scope The purpose of this Technical Architecture is to define the technologies, products, and techniques necessary to develop and support the system and to ensure that the system components are compatible and comply with the enterprise-wide standards and direction defined by the Agency. The document's scope is to identify and explain the advantages and risks inherent in this Technical Architecture. This document is not intended to address the installation and configuration details of the actual implementation. Installation and configuration details are provided in technology guides produced during the project. Audience The intended audience for this document is Project Stakeholders, technical architects, and deployment architects
  • 4. Platform Component Definition The system's overall architecture goals are to provide a highly available, scalable, & flexible data management platform A key Architectural goal is to leverage industry best practices to design and develop a scalable, enterprise-wide J2EE application and follow the industry-standard development guidelines. All aspects of Security must be developed and built within the application and be based on Best Practices. Security User Management Base Component Gateway Component Authentication / Authorization / Crypto User / Groups Roles / Responsibility Access Manager Workflow Versioning Notification Logging Scheduler Object Manager API Gateway Data Quality Management Master Data Governance Analytical MDM (Customer 360, Supplier360, Product 360) Data Migration Setup Migration Test data Prep Big Data Ingestion Data Archival Data Reconciliation Data Integration Data Masking Data Compliance (PII, GDPR, CCPA, OIOO) Data Cataloging Data Analytics Data Visualizat Used for Autonomous Regression Testing Used for Load and Performance Testing Rapid Application Develiopment (RAD) Framework Visual Development Approach Drag & Drop Design Tools Functional Components into Visual Workflow Foundation Smart data platform Smart Business Platform Architecture Goals
  • 5. The Platform Foundation forms the base on which the entire Platform is built. The major components that create the Platform are described in brief. Security Management User Management Base Components Gateway Component Users User Groups Object Access Manager Responsibilities Role Hierarchy JWT SAML OAuth2.0 Federated Authentication Platform Authentication Credential Authentication AD LDAP Authentication Service Credential Authenticator SSO Authenticator Authorization Engine Org/License Authorization App / Node Authorization Access Authorization Hashing Algorithm Asymmetric Encryption Crypto Engine MD5 SHA1 AES 128 AES 256 Platform API API Gateway Engine Login API REST API Publisher SOAP Service Publisher Job Feedback Workflow Logging Constructs Application Logs Execution Logs Audit Logs Approvals Activities SLA Collaborate Versioning EMAIL SVN GIT Database Web Notification Chats Platform Object Manager Object Sharing Scheduler Job Schedular Job Feedback Dependent Sharing Sharing Manager Platform Foundation
  • 6. User Management
This component manages all Roles, Responsibilities, Hierarchies, Users, and User Groups.
Responsibilities: The Platform comes with preconfigured Responsibilities for dataZap, dataZen, and dataZense. Organizations can also define their own Responsibilities, which are assigned to platform objects with additional privileges.
Roles: The Platform comes with predefined Roles for dataZap, dataZen, and dataZense, and organizations can create their own. A role-based hierarchy is also configured for the user-level hierarchy, and each role is assigned default responsibilities.
Users: Users are assigned only the applications they need, and each user is given a Role. The reporting hierarchy is formed through the role-hierarchy setup, where a manager from the next role up is assigned. The responsibilities attached to a role are granted by default; a user can be given additional responsibilities, or have an existing responsibility revoked, against a role. Users gain access to objects based on the privileges assigned to their responsibilities.
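As an illustration of this model, here is a minimal sketch of how a user's effective responsibilities could be resolved from role defaults plus individual grants and revocations. The class and method names are illustrative only, not the Platform's actual API:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative model: a Role carries default Responsibilities; a User may
// have extra grants or revocations on top of the role defaults.
class Role {
    final String name;
    final Set<String> defaultResponsibilities = new HashSet<>();
    Role(String name) { this.name = name; }
}

class User {
    final String userName;
    final Role role;
    final Set<String> grantedResponsibilities = new HashSet<>();
    final Set<String> revokedResponsibilities = new HashSet<>();
    User(String userName, Role role) { this.userName = userName; this.role = role; }

    // Effective responsibilities = role defaults + individual grants - revocations.
    Set<String> effectiveResponsibilities() {
        Set<String> effective = new HashSet<>(role.defaultResponsibilities);
        effective.addAll(grantedResponsibilities);
        effective.removeAll(revokedResponsibilities);
        return effective;
    }
}

public class UserManagementSketch {
    public static void main(String[] args) {
        Role dataSteward = new Role("Data Steward");
        dataSteward.defaultResponsibilities.add("dataZen:RuleAuthoring");
        dataSteward.defaultResponsibilities.add("dataZense:ViewDashboards");

        User user = new User("jdoe", dataSteward);
        user.revokedResponsibilities.add("dataZense:ViewDashboards"); // revoked per user

        System.out.println(user.effectiveResponsibilities()); // [dataZen:RuleAuthoring]
    }
}
```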
  • 7. Security Management
The security management component covers the following.
SSL Authentication Engine: The Platform is SSL/HTTPS enabled on the transport layer with TLS 1.2 support. SSL is applied to the nodes exposed to users, such as the DMZ and Web nodes, and to the nodes exposed to third-party applications, such as the API Gateway nodes.
Credential Authentication: The Platform offers credential-based authentication handled by the Platform itself, as well as Single Sign-On based federated authentication; both can co-exist for an organization. Users authenticate with their supplied credentials. All successful sessions are logged, and failed attempts are tracked at the user level for locking the account. A user can have only one web session at any given point in time. The password policy, including expiry, is configured at the organization level and applies to all users. Enforced password complexity covers minimum length, maximum length, and the required use of numbers, cases, and special characters; the number of unsuccessful attempts before lockout is also configurable.
Single Sign-On: SSO can be set up with federated services such as SAML, OAuth 2.0, or JWT (JSON Web Tokens). An identity provider (IdP) is configured against the organization profile, and authentication happens in the IdP; the flow can be either IdP-initiated or SP-initiated (the SP being the ChainSys Smart Data Platform). Organization users with SSO log in through a different context.
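The configurable complexity rules described above can be pictured with a small sketch; the parameter names and rule set here are illustrative, not the Platform's actual configuration API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative password-policy check mirroring the configurable rules the
// slide describes: min/max length and required character classes.
public class PasswordPolicySketch {
    static List<String> validate(String password, int minLen, int maxLen,
                                 boolean needDigit, boolean needMixedCase, boolean needSpecial) {
        List<String> violations = new ArrayList<>();
        if (password.length() < minLen) violations.add("shorter than " + minLen);
        if (password.length() > maxLen) violations.add("longer than " + maxLen);
        if (needDigit && !password.chars().anyMatch(Character::isDigit))
            violations.add("missing digit");
        if (needMixedCase && !(password.chars().anyMatch(Character::isUpperCase)
                && password.chars().anyMatch(Character::isLowerCase)))
            violations.add("missing mixed case");
        if (needSpecial && password.chars().allMatch(Character::isLetterOrDigit))
            violations.add("missing special character");
        return violations; // an empty list means the password satisfies the policy
    }

    public static void main(String[] args) {
        System.out.println(validate("Weak1", 8, 64, true, true, true));
        // -> [shorter than 8, missing special character]
    }
}
```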
  • 8. Authorization Engine
The first level of authorization is the Organization License; the licensing engine is also used to set up the organization for authentication. The next level is the applications assigned to the organization and to the respective user; individual application nodes are provisioned for the organization per the service agreement to handle load balancing and high availability. Authorization of pages is driven by the responsibilities assigned to users, while authorization of a record is driven by sharing the record with a group or with individual users. Responsibilities and sharing carry the respective privileges to pages and records; on conflict, the principle of least privilege determines the access.
Crypto Engine
The Crypto Engine handles both asymmetric encryption and hashing algorithms. AES 128 is the default encryption algorithm, with 256-bit keys also supported. Keys are managed within the Platform at the organization level, and key usage maintained at the application level determines how they are applied for encryption and decryption. All internal passwords are stored with MD5 hashing by default, and encryption of the stored data can be done at the database layer as needed.
Workflow Engine
The workflow engine manages the orchestration of the flow of activities. It is part of the platform foundation and is extended by the applications to add application-specific activities.
Version Management
This component handles the versioning of objects and records eligible for versioning. The foundation provides the API to version objects and their records, which the applications can extend with specific functionality. Currently, the Platform supports SVN as the default and also supports database-level version management; support for GIT is on the roadmap.
Notification Engine
The notification engine delivers all notifications to users in the system, including on-page notifications when the user is online in the application. Other notifications, such as mail and chat notifications, are also part of this component.
Logging Engine
All activity logs, for both the foundation and the applications, are handled here to aid understanding and debugging.
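As a sketch of the Crypto Engine's symmetric side, the following uses the standard javax.crypto API with AES-128, the default key size named above. The GCM mode and the in-memory key are assumptions for the demonstration only; real keys are managed at the organization level:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

// Illustrative symmetric encryption with AES-128 (256-bit keys are also
// supported by the Platform). A fresh key is generated just for the demo.
public class CryptoEngineSketch {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128); // default key size per the slide; 256 also supported
        SecretKey orgKey = keyGen.generateKey();

        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv); // unique IV per encryption

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, orgKey, new GCMParameterSpec(128, iv));
        byte[] encrypted = cipher.doFinal("sensitive column value".getBytes(StandardCharsets.UTF_8));
        System.out.println(Base64.getEncoder().encodeToString(encrypted));

        // Decryption with the same key and IV, as applied before loading to the target.
        cipher.init(Cipher.DECRYPT_MODE, orgKey, new GCMParameterSpec(128, iv));
        System.out.println(new String(cipher.doFinal(encrypted), StandardCharsets.UTF_8));
    }
}
```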
  • 9. API Gateway Engine
The API Gateway forms the foundation for publishing and consuming services with the Platform. All eligible jobs or actions can be published for external applications to access. The following components make up the publishing node.
Login Service: The login service authenticates whether the requesting consumer has the proper credentials to invoke the job or action. The publisher engine supports two methods of authentication:
- Inline authentication, where every request carries the credentials for authentication and access control.
- Session authentication, where this service is invoked explicitly to obtain a token, and the other published services are then called with this token to authorize the request.
SOAP Service: Eligible jobs or actions can be published using the Simple Object Access Protocol (SOAP), a messaging protocol that allows programs running on disparate operating systems to communicate using Hypertext Transfer Protocol (HTTP) and Extensible Markup Language (XML).
REST Service: Eligible jobs or actions can be published using Representational State Transfer (REST). REST communicates over HTTP like SOAP but supports messages in multiple formats; dataZap publishes in XML or JSON (JavaScript Object Notation) format.
Scheduler Creation: A job can be scheduled once or on a recurring basis; recurring jobs can be planned minutely, hourly, weekly, and monthly.
Scheduler Execution: The scheduler execution engine uses the configuration and fires the job in the respective application. The next run is scheduled at the end of each job, as per the configuration.
Job Monitoring: Scheduled jobs are monitored, and their progress and status are tracked at every stage. If a job is delayed or hits unexpected errors, the responsible users are notified for action.
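To make the session-authentication flow concrete, here is a minimal consumer-side sketch using the JDK 11 HttpClient: log in once for a token, then invoke a published REST job with it. The endpoint paths, payload, and header scheme are hypothetical:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative consumer of the session-authentication method: call the Login
// API for a token, then pass the token when invoking a published REST job.
public class GatewayClientSketch {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();

        // Step 1: explicit login to obtain a session token.
        HttpRequest login = HttpRequest.newBuilder()
                .uri(URI.create("https://platform.example.com/api/login"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"user\":\"api_user\",\"password\":\"secret\"}"))
                .build();
        String token = http.send(login, HttpResponse.BodyHandlers.ofString()).body();

        // Step 2: invoke a published job, authorizing the request with the token.
        HttpRequest runJob = HttpRequest.newBuilder()
                .uri(URI.create("https://platform.example.com/api/jobs/customer-sync/run"))
                .header("Authorization", "Bearer " + token)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        System.out.println(http.send(runJob, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```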
  • 10. dataZap Component
The Execution Handler is available both on the client side and in the cloud, to support pure-cloud environments and to manipulate data in the cloud with less load at the client end. The Execution Controller runs in the cloud and directs the Execution Handler at every step.
[Diagram: dataZap component architecture: endpoint connectors (JCO, JDBC, REST, SOAP, OData) to relational databases, cloud applications, big data lakes, NoSQL databases, enterprise storage systems, message brokers, and enterprise applications; Extract Adapter (data object engine, extract engine, filter engine, child iterator, CDC engine, crypto engine, data stream service); Dataflow Adapter (mapper, passive transformations with lookup/sequence/expression, active transformations with aggregator, sorter, unifier, joiner, normalizer, router, validation engine, reprocessing engine, reconciliation engine); Load Adapter (ingestion engine, pre-load, post-load, data load engine, crypto engine, BOTS playback and builder objects); Process Flow and Reconciliation Adapters with comparator and visualization API; migration, master/transaction, and setup migration flows; scheduler (job initiation, exception notification); API Gateway with REST API publisher; versioning engine for object versioning; reporting engine; localized transformation API (TAPI).]
  • 11. Endpoint Connectors
This component provides the base connectors used to connect to most endpoint applications. The base connectors include:
- JDBC, for all RDBMS connections
- SAP JCo, for connecting to SAP systems
- SOAP, for applications enabled with SOAP APIs
- REST, for applications enabled with REST APIs
- OData, for applications enabled with OData APIs
- FTP, to connect and extract data from files on FTP sites
- NoSQL, to connect to NoSQL databases such as MongoDB
- Message Broker, to connect to messaging services such as ActiveMQ and IBM MQ
Connections through all of the above base connectors can be secured via a secured layer, based on the endpoint configuration. The specific connectors for enterprise applications are wrappers built over these base connectors, with the particular security and governance each application needs. The diagram shows a few of the existing wrapper endpoints created for enterprise applications in the market. For applications that have no specific connector, the base connectors can be used, provided the endpoint requires no particular authentication method beyond the base-level authentication provided. ChainSys builds application-specific connectors where they do not already exist.
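A minimal sketch of what extraction through the JDBC base connector looks like; the connection details, credentials, and table are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Illustrative use of the JDBC base-connector style: open a connection to an
// RDBMS endpoint and stream rows out. Enterprise-specific connectors wrap
// this with the extra security and governance the application needs.
public class JdbcConnectorSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://source-db.example.com:5432/erp";
        try (Connection conn = DriverManager.getConnection(url, "extract_user", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rows = stmt.executeQuery("SELECT customer_id, name FROM customers")) {
            while (rows.next()) {
                System.out.println(rows.getLong("customer_id") + " -> " + rows.getString("name"));
            }
        }
    }
}
```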
  • 12. Load Adapter
The component that handles loading data into multiple systems or endpoints.
Ingestion Engine: Initial data marting happens in this engine, after which the data is manipulated.
Crypto Engine: The Crypto Engine encrypts the data during the data-marting process to ensure the data is protected in all formats. It also applies decryption before loading the data to the final target endpoint.
Reconciliation Engine: The reconciliation engine handles the technical reconciliation of the data in two stages: at the end of the pre-validation stage, to determine the differences between the raw data and the transformed data, and again after loading completes, to determine the differences between the loaded data and the raw or changed data.
Reprocessing Engine: This component helps correct errored data both at the pre-validation level and at the post-load level. Data fixes can be handled online or offline: users can download the error data as an Excel file and upload the corrected data as a bulk update. Beyond error correction, data can also be enhanced or constructed so that it passes the validation step for quality.
Loading Engine: The loading engine is where the application identifies the endpoint type and uses the matching engine to load the data into the target application. It also has special adapters to use the Playback Adapters of Smart BOTS and Smart App Builder in the business platform.
Passive Transformation: This transformation only changes column values from one form to another; the row count is unaffected. The available types include lookup, sequence, and expression-based transformations.
TAPI: TAPI helps create reusable transformations (APIs) shared across multiple objects, so a change is made in one place rather than in numerous areas.
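To illustrate the passive-transformation and TAPI ideas, the sketch below models a transformation as a reusable function (define once, apply across objects); the lookup table and expression are illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative passive transformations: each changes column values without
// affecting the row count. Modelling a transformation as a reusable function
// mirrors the TAPI idea of defining it once and reusing it across objects.
public class PassiveTransformationSketch {
    // Lookup transformation: map a source code to a target value via a table.
    static Function<String, String> lookup(Map<String, String> table, String defaultValue) {
        return value -> table.getOrDefault(value, defaultValue);
    }

    // Expression transformation: derive a value from the input by an expression.
    static Function<String, String> upperCaseTrim() {
        return value -> value == null ? null : value.trim().toUpperCase();
    }

    public static void main(String[] args) {
        Map<String, String> countryCodes = new HashMap<>();
        countryCodes.put("US", "United States");
        countryCodes.put("IN", "India");

        Function<String, String> countryLookup = lookup(countryCodes, "Unknown");
        System.out.println(countryLookup.apply("IN"));         // India
        System.out.println(upperCaseTrim().apply("  acme "));  // ACME
    }
}
```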
  • 13. Extract Adapter
The extract adapter retrieves data from many different types of endpoints and processes it into the expected format.
Data Object Engine: This engine handles almost all kinds of systems and formats for retrieving data; it can work with SQL, flat files, and SOAP and REST services.
Filters: Filters reduce the number of rows taken from the raw data extracted from the source.
Child Iterator: This component handles the master-child relationship between data extracts, so that a filter applied on the master propagates down to all child levels.
Crypto Engine: The Crypto Engine reads or extracts data with encryption applied over the fields selected for extraction. This prevents the data from being accessed in the clear from either the front end or the back end.
Data Streaming Service: Data extracted from the data object or the extract adapter is streamed to the applications that call the service to pull data from the endpoint.
Changed Data Capture: This component gets the changed data from the source. Two modes are available. The recommended option is to assign a date field used to bring in the changed data. Alternatively, changes can be detected by comparing records, which is resource-intensive and not recommended unless no date field is available for comparison.
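A minimal sketch of the recommended date-field CDC mode over JDBC: pull only rows changed since the last successful run, then advance the watermark. The table, column, and watermark handling are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;

// Illustrative date-field CDC: select rows whose update timestamp is newer
// than the persisted watermark, and compute the next watermark as we go.
public class ChangedDataCaptureSketch {
    public static void main(String[] args) throws Exception {
        Timestamp lastRun = Timestamp.valueOf("2024-01-01 00:00:00"); // persisted watermark

        String url = "jdbc:postgresql://source-db.example.com:5432/erp";
        String sql = "SELECT order_id, status, last_update_date FROM orders WHERE last_update_date > ?";
        try (Connection conn = DriverManager.getConnection(url, "extract_user", "secret");
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setTimestamp(1, lastRun);
            try (ResultSet changed = stmt.executeQuery()) {
                Timestamp newWatermark = lastRun;
                while (changed.next()) {
                    System.out.println("changed order: " + changed.getLong("order_id"));
                    Timestamp updated = changed.getTimestamp("last_update_date");
                    if (updated.after(newWatermark)) newWatermark = updated;
                }
                // Persist newWatermark for the next scheduled extract.
                System.out.println("next watermark: " + newWatermark);
            }
        }
    }
}
```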
  • 14. Active Transformation
Active transformations affect the number of rows. The available active transformations are the Normalizer, Joiner, Router, Unifier, Aggregator, and Sorter. The active transformation engines convert the data structures from the source to the target. The rules engine (Router) moves data to different endpoints according to rules, and data between two systems can be compared to determine an action before moving it.
Dataflow Adapter: The dataflow adapter helps transform and map data from multiple sources to multiple target systems.
Migration Flow: This component overrides the workflow component in the foundation. The migration flow engine is specific to migrating master or transaction data, with orchestration capabilities and human-intervention steps such as Approval, User Confirmation, and Receive Input.
Process Flow Engine: This component also overrides the foundation workflow component. The process flow engine is specific to data movement, with the full set of orchestration capabilities and human-intervention steps such as Approval, User Confirmation, and Receive Input.
Scheduler: These components provide the dataZap-specific execution agents for jobs run by the base scheduling engine. They are wrappers for the data movement components: Load Adapters, Extract Adapters, Dataflow Adapters, and Process Flows.
API Gateway: These are execution agents for the publisher in the foundation, wrapping the jobs to be executed in the data movement components: Load Adapters, Extract Adapters, Dataflow Adapters, and Process Flows.
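The row-count-changing behavior can be illustrated with two of the named transformations, a Router and an Aggregator, sketched here with Java streams; the record layout and the routing rule are illustrative:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative active transformations: a Router that splits rows by a rule
// and an Aggregator that collapses rows; both change the row count.
public class ActiveTransformationSketch {
    static class Order {
        final String region;
        final double amount;
        Order(String region, double amount) { this.region = region; this.amount = amount; }
        public String toString() { return region + ":" + amount; }
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order("EU", 120.0), new Order("US", 80.0), new Order("EU", 40.0));

        // Router: rows matching the rule go to one endpoint, the rest elsewhere.
        Map<Boolean, List<Order>> routed =
                orders.stream().collect(Collectors.partitioningBy(o -> o.amount >= 100));
        System.out.println("high-value route: " + routed.get(true));

        // Aggregator: one output row per region (3 input rows -> 2 output rows).
        Map<String, Double> totals = orders.stream()
                .collect(Collectors.groupingBy(o -> o.region,
                        Collectors.summingDouble(o -> o.amount)));
        System.out.println("totals by region: " + totals);
    }
}
```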
  • 15. Reconciliation Adapter
The reconciliation adapter generates the query to compare the data and produces the Visualization API result used to create the necessary reconciliation dashboard.
Reporting Engine
The reporting engine generates reports on the execution of the various adapters and produces dashboards for understanding the actions taken and still to be taken.
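A minimal sketch of the kind of comparison a reconciliation query could run: matching record counts and a control sum between source and target. The connection details and query are placeholders; the actual adapter generates these queries per object:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Illustrative technical reconciliation: compare record counts and a control
// sum between a source table and a target table.
public class ReconciliationSketch {
    static long[] countAndSum(String url, String table) throws Exception {
        try (Connection conn = DriverManager.getConnection(url, "recon_user", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM " + table)) {
            rs.next();
            return new long[] { rs.getLong(1), rs.getLong(2) };
        }
    }

    public static void main(String[] args) throws Exception {
        long[] source = countAndSum("jdbc:postgresql://source-db.example.com:5432/erp", "orders");
        long[] target = countAndSum("jdbc:postgresql://target-db.example.com:5432/erp", "orders");
        System.out.println("count difference: " + (source[0] - target[0]));
        System.out.println("control-sum difference: " + (source[1] - target[1]));
    }
}
```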
  • 16. System Technology Landscape
[Diagram: System technology landscape: DMZ nodes running the Apache HTTPD server (web load balancing, reverse and forward proxy, single sign-on); web application nodes on Apache Tomcat 9; foundation nodes (caching node, scheduler node, file/log server); Apache ActiveMQ collaborate server; API gateway; analytics and tooling stack (R analytics, dimple.js, NodeJS 12.16 with Ionic V4, Selenium WebDriver); default data stores (metadata store, versioning store, data mart, indexing store, app data store, CouchDB).]
  • 17. DMZ Nodes
These nodes are generally the only nodes exposed to the external world outside the enterprise network. The two nodes in this layer are the Apache HTTPD server and the Single Sign-On node.
Apache HTTPD: The Apache HTTPD server routes calls to the Web nodes and also handles load balancing for both the Web Server nodes and the API Gateway nodes. The following features are used:
- Highly scalable forward/reverse proxy with caching
- Multiple load balancing mechanisms
- Fault tolerance and failover with automatic recovery
- WebSocket support with caching
- Fine-grained authentication and authorization access control
- Loadable dynamic modules, such as ModSecurity for WAF
- TLS/SSL with SNI and OCSP stapling support
Single Sign-On: This node is built on the Spring Boot application with Tomcat as the servlet container. Organizations opting for single sign-on get a separate SSO node with a particular context; the default context takes users to platform-based authentication.
Web Nodes
This layer consists of the nodes exposed to users for invoking actions through the front end, or to third-party applications as APIs. The nodes in this layer are the Web Server that renders the web pages, the API Gateway through which other applications interact with the Platform, and the Collaborate node for notifications.
  • 18. Web Server
The web application server hosts all the web pages of the ChainSys Platform.
- Apache Tomcat 9.x is used as the servlet container.
- JDK 11 is the JRE used for the application; the Platform works on OpenJDK, Azul Zulu, AWS Corretto, and Oracle JDK.
- Struts 1.3 is used as the controller framework.
- Integration between the web server and the application nodes is handled with microservices based on Spring Boot.
- The presentation layer uses HTML 5 / CSS 3 components and scripting frameworks such as jQuery and d3.js.
- The web server can be clustered to n nodes according to the number of concurrent users and requests.
Gateway Node
This node uses the Jetty service to publish APIs as SOAP or REST. The API Gateway can be clustered based on the number of concurrent API calls from external systems. Denial-of-Service (DoS) protection is implemented for both JAX-WS and JAX-RS to prevent attacks.
Collaborate
This node handles all the different kinds of notifications to users, such as front-end notifications, emails, and push notifications (on the roadmap), and uses all the default application services. The notification engine uses Netty APIs to send notifications from the Platform, and Apache ActiveMQ is used for messaging the notifications from the application nodes. This node also provides chat services that the applications can use as needed.
  • 19. Application Nodes
The application nodes are Spring Boot applications that communicate with the other application nodes and the web servers. Load balancing is handled by HAProxy, based on the number of nodes instantiated for each application. JDK 11 is the JRE used for the application; the Platform works on OpenJDK, Azul Zulu, AWS Corretto, and Oracle JDK.
Most application nodes use only the default services mentioned above. The analytical services / catalog services node additionally uses R analytics for machine-learning algorithms, and D3 and Dimple JS for the visual layer.
  • 20. The automation services node uses all the default services mentioned above and, in addition, the Selenium API for web-based automation, as well as Sikuli.
The application-builder services node also uses all the default services mentioned above; these services configure the custom applications and generate dynamic web applications as configured. The mobile application services require NodeJS 12.16, which uses the Ionic Framework V4 to build the web and mobile apps for the configured custom applications.
  • 21. Data Storage Nodes
Database: The ChainSys Platform supports both PostgreSQL 9.6 or higher and Oracle 11g or higher for both of the following:
1. Metadata of the setups and configurations of the applications
2. Data marting for the temporary storage of data
The Platform uses PostgreSQL for the metadata in the cloud. PostgreSQL is a highly scalable database:
- It is designed to scale vertically, by running on bigger and faster servers when more performance is needed.
- It can be configured for horizontal scaling; Postgres has useful streaming-replication features, so multiple replicas can be created for reading data.
- Based on the above, it can easily be configured for High Availability.
Scheduler Node: This node uses only the default application node services and can be clustered only as failover nodes. When the primary node is down, HAProxy promotes the secondary node to primary. The secondary node then handles notifications and the automatic rescheduling of jobs, calling each schedulable application object so that all possible exception scenarios are addressed. Once the original node is up and running again, it becomes the secondary node.
  • 22. Cache Server
Redis is used for caching the platform configuration objects and execution progress information. This avoids network latency across the database and thus increases the performance of the application. When durability of the data is not needed, the in-memory nature of Redis allows it to perform well compared to database systems that write every change to disk before considering a transaction committed. The component is set up as a distributed cache service to enable better performance during data access. Redis can run as HA-enabled clusters and supports master-replica replication.
The multi-tenant database architecture is designed around the following:
- A separate database for each tenant
- Trusted database connections for each tenant
- Secure database tables for each tenant
- Easy extensibility through custom columns
- Scalability handled by single-tenant scale-out
PostgreSQL offers encryption at several levels: password storage encryption, encryption for specific columns, data partition encryption, encrypting passwords across a network, encrypting data across a network, SSL host authentication, and client-side encryption. This provides flexibility in protecting data from disclosure due to database server theft, unscrupulous administrators, and insecure networks. Encryption might also be required to secure sensitive data.
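A minimal cache-aside sketch with the Jedis client, along the lines of caching a configuration object with a TTL; the host, key, and loader stub are placeholders:

```java
import redis.clients.jedis.Jedis;

// Illustrative cache-aside pattern over Redis: read the configuration object
// from the cache first and fall back to the database on a miss, caching the
// result with a TTL so repeated reads avoid the metadata store.
public class RedisCacheSketch {
    static String loadConfigFromDatabase(String key) {
        return "{\"object\":\"" + key + "\"}"; // stand-in for a metadata-store query
    }

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("cache-node.example.com", 6379)) {
            String key = "platform:config:dataflow-42";
            String config = jedis.get(key);
            if (config == null) {                       // cache miss
                config = loadConfigFromDatabase(key);   // hit the metadata store once
                jedis.setex(key, 300, config);          // cache for 5 minutes
            }
            System.out.println(config);
        }
    }
}
```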
  • 23. [Diagram: objects versioned by the applications: loader adapters, data objects, data extracts, data flows, process flows, migration flows, and reconciliations; data models, rules, augmentations, and workflows; data sets, views, dashboards, and ad-hoc reports; object models, layouts, and workflows.]
File / Log Server
This component is used for centralized logging, handling the application logs, execution logs, and error logs on a common server for the platform applications. Log4J is used for distributed logging. These logs can be downloaded for monitoring and auditing purposes; a small HTTP service allows users to download the files from this component. It is implemented with the single-tenant scale-out approach.
Subversion (SVN) Server
Apache Subversion (abbreviated SVN) is a software versioning and revision control system distributed as open source under the Apache License. The Platform uses SVN to version all the metadata configurations, either to revert within the same instance or to move configurations across instances for different milestones. All the applications in the Platform use the foundation APIs to version their objects as needed.
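A minimal sketch of logging through the Log4j API (shown here with Log4j 2) against the three log streams named earlier; the logger names are illustrative, and in a centralized setup the appenders would be configured to write to the file/log server:

```java
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// Illustrative use of the Log4j 2 API for the application, execution, and
// audit log streams; appender routing lives in the Log4j configuration.
public class LoggingSketch {
    private static final Logger appLog = LogManager.getLogger("ApplicationLogs");
    private static final Logger execLog = LogManager.getLogger("ExecutionLogs");
    private static final Logger auditLog = LogManager.getLogger("AuditLogs");

    public static void main(String[] args) {
        appLog.info("dataflow adapter initialized");
        execLog.info("job {} started by scheduler", "customer-sync");
        auditLog.warn("responsibility {} revoked for user {}", "RuleAuthoring", "jdoe");
    }
}
```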
  • 24. Apache Solr
The ChainSys Platform uses Solr as the indexing and search engine for its data cataloging needs. Solr is an open-source enterprise search platform; its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features, and rich document handling. Apache Solr was chosen over the alternatives for the following reasons.
Real-Time, Massive Read and Write Scalability: Solr supports large-scale, distributed indexing, search, and aggregation/statistics operations, enabling it to handle both large and small applications. Solr also supports real-time updates and can take millions of writes per second.
SQL and Streaming Expressions/Aggregations: Streaming expressions and aggregations provide the basis for running traditional data-warehouse workloads on a search engine, enhanced by much more complex matching and ranking criteria.
Security Out of the Box: Security is built in, integrating with systems like Kerberos, SSL, and LDAP to secure the system and the content inside it.
Fully Distributed Sharding Model: Solr moved from a master-replica model to a fully distributed sharding model in Solr 4, focusing on consistency and accuracy of results over other distributed approaches.
Cross-Data Center Replication Support: Solr supports active-passive CDCR, enabling applications to synchronize indexing operations across data centers in different regions without third-party systems.
Big Data Enablement: Users can store Solr's data in HDFS; Solr integrates with Hadoop's authentication approaches and leverages ZooKeeper to simplify the fault-tolerance infrastructure.
Documentation and Support: Solr has an extensive reference guide covering the functional and operational aspects of every version.
Machine Learning: Solr is actively adding capabilities to make Learning to Rank (LTR) an out-of-the-box functionality.
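A minimal catalog-search sketch with the SolrJ client; the Solr URL, collection, and field names are placeholders:

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

// Illustrative catalog search: a faceted full-text query against a Solr
// collection holding catalog entries.
public class CatalogSearchSketch {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder(
                "http://solr-node.example.com:8983/solr/data_catalog").build()) {
            SolrQuery query = new SolrQuery("dataset_name:customer*");
            query.setRows(10);
            query.addFacetField("source_system"); // faceted search over catalog entries

            QueryResponse response = solr.query(query);
            for (SolrDocument doc : response.getResults()) {
                System.out.println(doc.getFieldValue("dataset_name"));
            }
        }
    }
}
```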
  • 25. Apache CouchDB
The ChainSys Platform uses CouchDB for the mobile applications in the Application Builder module. PostgreSQL is the initial entry point for the dynamic web applications, and its data syncs to CouchDB when mobile applications are enabled. In contrast, the initial entry point for the dynamic mobile applications is PouchDB: CouchDB syncs with the PouchDB instances on the mobile devices, and then syncs with PostgreSQL. The main reasons for choosing CouchDB are:
- CouchDB uses HTTP and REST as its primary means of communication, so client apps can talk to the database directly.
- The Couch Replication Protocol lets data flow seamlessly between server clusters, mobile phones, and web browsers, enabling a compelling offline-first user experience while maintaining high performance and strong reliability.
- CouchDB was designed from the bottom up to enable easy synchronization between different databases.
- CouchDB uses JSON as its data format.
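Because CouchDB's API is plain HTTP/JSON, a document can be created with a simple PUT. The sketch below uses the JDK 11 HttpClient; the host, database, document, and credentials are placeholders:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative interaction with CouchDB over its native HTTP/JSON API:
// create a document with PUT /{db}/{docid}.
public class CouchDbSketch {
    public static void main(String[] args) throws Exception {
        String auth = Base64.getEncoder()
                .encodeToString("admin:secret".getBytes(StandardCharsets.UTF_8));
        HttpRequest put = HttpRequest.newBuilder()
                .uri(URI.create("http://couch-node.example.com:5984/app_builder/doc-001"))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(
                        "{\"type\":\"layout\",\"name\":\"order-entry\"}"))
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(put, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body()); // 201 on create
    }
}
```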
  • 26. Deployment at Customer: Distributed Mode
The ChainSys Smart Data Platform is a highly distributed application with a highly scalable environment. Most of the nodes are horizontally and vertically scalable.
[Diagram: Distributed deployment: DMZ services VM (Apache HTTPD server, single sign-on, load balancer); web container cluster with n-node web page services, Collaborate services, and API Gateway; foundation services cluster (caching node, file/log services, and scheduling services with primary and secondary nodes); Smart Data Platform cluster with n-node application services plus layout build, design and process, and layout rendering nodes; database layer with a primary/secondary database cluster (metadata and datamart) and a versioning VM; Solr cluster (master and slave nodes with multiple cores); CouchDB cluster.]
  • 27. DMZ Nodes
Apache HTTPD is needed in a distributed environment as a load balancer, and is also used as a reverse proxy for access from outside the network; it is a mandatory node. The SSO node is needed only if Single Sign-On capability with a federated service is required.
Web Cluster
ChainSys recommends a minimum two-node web cluster for high availability, load balanced for better performance. This node is mandatory for the ChainSys Platform. The number of nodes is not restricted to two and can be scaled according to the concurrent usage of the application pages. The Collaborate node is generally a single node but can be configured for High Availability if needed.
Gateway Cluster
The API Gateway nodes are not mandatory; they are required only when application APIs need to be exposed outside the Platform. When deployed, ChainSys recommends a two-node cluster for high availability and load balancing under heavy API traffic. The number of nodes in the cluster is determined by API call volume and is not restricted to two.
Application Cluster
HAProxy or Apache HTTPD acts as the load balancer; all calls between the application nodes are handled based on the node configuration. If Apache HTTPD is used in the DMZ for reverse proxying, it is recommended to use HAProxy, or a separate Apache HTTPD, for internal routing. The number of nodes in the cluster is not restricted to two; individual application nodes can be scaled horizontally for load balancing according to processing and mission-critical needs. The Integration cluster is a mandatory deployment: all the other applications depend on it for their integration needs. The Visualization cluster is likewise mandatory: all the other applications depend on it for dashboards and reports.
  • 28. Data Storage Nodes
Generally, the PostgreSQL database is configured for High Availability as an active-passive instance; depending on the number of read/write operations, it can be load balanced too. It can be replaced by Oracle 11g or higher if the client wants to use an existing database license.
A File Server is needed only if no NAS or SAN is available to mount shared disk space into the clusters for distributed logging; the NFS operations for distributed logging require this node.
An SVN server is mandatory, to store all the configuration objects in the repository for porting from one instance to another. It is generally a single node, as the load on it is not high.
Redis is used as the cache engine and is mandatory for a distributed deployment; it can be configured for high availability using master-slave replication.
SOLR is needed only if data cataloging is implemented and search capability is enabled. It can be configured for High Availability, and SOLR sharding can be used when the data is too large for one node or should be distributed to increase performance and throughput.
CouchDB is needed only if dynamic mobile applications are to be generated; it can be configured for high availability. For better performance, ChainSys recommends individual CouchDB instances for each active application.
The visualization application uses R Studio Server for its Machine Learning capabilities; it is needed only when Machine Learning algorithms are used.
When deploying MDM, the "Smart Application Builder" node is needed for dynamic layout generation and augmentation. The reverse does not apply, as "Smart Application Builder" does not depend on the MDM nodes.
NodeJS is needed only when mobile applications are to be generated dynamically; the Apache HTTPD server handles its load balancing.
The Scheduler cluster is needed if even one application uses the scheduling capability. This cluster is for High Availability (failover) only, not load balancing, and the number of nodes is restricted to two.
  • 29. Deployment at Customer: Single Node
[Diagram: Single-node deployment: DMZ services VM (Apache HTTPD server, single sign-on); an application services VM running the web services (Apache Tomcat 9, Apache ActiveMQ) together with the foundation services (caching, Collaborate, scheduling, file/log) and the Smart Data Platform, Smart App Builder, Smart BOTS, analytics, design and process, layout build and render, and catalog services; separate VMs for indexing, NoSQL, versioning, and the metadata/datamart database.]
"Single Node" is not meant literally. It means that all application services of the ChainSys Platform are deployed on a single node or server, while the data storage nodes remain separate servers or nodes. This type of installation is generally for a patching environment without many operations; it is also recommended for non-mission-critical development activities where high availability and scalability are not determining factors.
  • 30. DMZ Nodes
Apache HTTPD is needed only if a reverse proxy is required for access from outside the network; it is not a mandatory node for a single-node installation. The SSO node is needed only if Single Sign-On capability with a federated service is required.
Application Server
There is just one Apache Tomcat as the web application service, and it is not configured for high availability. The Collaborate service runs Apache ActiveMQ and the Spring integration service. The API Gateway is required only if objects are to be published as REST APIs or SOAP services; this service can be shut down if not needed. The Integration, Visualization, and Scheduler services must be running; the remaining applications run or stay shut down depending on the license and need.
Data Storage Nodes
PostgreSQL runs on a separate node; ChainSys does not recommend having the applications and the databases on the same machine. An SVN server is mandatory, to store all the configuration objects in the repository for porting from one instance to another. SOLR is needed only if data cataloging is implemented and search capability is enabled. CouchDB, as a separate node, is needed only if dynamic mobile applications are to be generated.
  • 31. Deployment at Customer: Instance Strategy
[Diagram: instance propagation: DEV (with its metadata DB) to TST/QA (with its metadata DB) to PRD (with its metadata DB).]
Generally, the instance propagation strategy above is recommended. Depending on the applications in use and the load, either a single-node deployment or a distributed-model deployment can be chosen; a distributed deployment is generally recommended for Production instances. The adapters are forward-propagated using the SVN repository, and the instances need not all follow the same deployment model. For reverse propagation, from Production to non-Production instances, the application and data storage layers can be cloned and the node configurations re-configured for the lower instances.
Built-in configuration management supports check-in and check-out without leaving the ChainSys Platform:
- It provides a solid software development lifecycle process for your projects.
- All your work is protected in a secure location and backed up regularly.
  • 32. Pure Cloud Deployment
The ChainSys Platform is available on the cloud. The Platform is hosted as a Public Cloud and also offers Private Cloud options.
[Diagram: Pure cloud topology: a public-cloud virtual network and a private-cloud virtual network, each with Dev and Prod DMZ subnets (Apache HTTPD server), application subnets, and per-tenant data subnets; per-tenant gateways connect the on-premise networks through site-to-site tunnels.]
  • 33. Public Cloud
Site-to-site tunneling handles connectivity between the tenant's data center and the ChainSys data center, and individual gateway routers can be provisioned per tenant. Tenants share the same application and DMZ node clusters, but not the data storage nodes. If a tenant needs a dedicated application node for higher workloads, that particular application node set can be reserved for that specific tenant. As mentioned earlier in the Database section, multi-tenancy is handled at the database level: tenants have separate database instances, provisioned based on the license and the subscription. Depending on the workload on the nodes, each node can be clustered to balance the load.
Private Cloud
Customers (tenants) have all applications, DMZ nodes, and data storage nodes assigned to the specific tenant; nothing is shared. Depending on the workload on the nodes, each node can be clustered to balance the load. The application nodes and databases are provisioned based on the license and subscription.
  • 34. Hybrid Cloud Deployment
The hybrid model can be combined with either the private or the public cloud. An Agent is deployed on the client organization's premises or data center to access the endpoints, which avoids creating a site-to-site tunnel between the client data center and the ChainSys cloud data center. A proxy (Apache HTTPD server) sits on both sides, in the ChainSys data center and in the client data center, and all back-and-forth communication between the ChainSys data center and the Agent is routed through the proxy only. The ChainSys cloud sends the Agent instructions to start a task, along with the task information; the Agent executes the task and sends the response, with the task's status, back to the cloud. Agents are available for dataZap, dataZense, and Smart BOTS. For dataZap, an existing database (either PostgreSQL or Oracle) can be used for the staging process; the Agent executes all integration and migration tasks by connecting directly to the source and target systems, validating and transforming data, and transferring data between them. For dataZen and Smart App Builder, data is streamed to the ChainSys Platform for manipulation.
[Diagram: Hybrid deployment: the ChainSys data center (DMZ nodes with Apache HTTPD and single sign-on, web nodes on Apache Tomcat 9, Collaborate server, API Gateway, application cluster nodes for analytics, catalog, design and process, layout build, and application deployment, scheduler node, file/log server, and data stores for metadata, versioning, app data, data mart, indexing, and caching) connected through proxies to the client data center, which runs the agent executables and a datamart database against the local endpoints.]
  • 35. Disaster Recovery
All the application nodes and web nodes are replicated using RSYNC; the specific install directory and any other log directories are synced to the secondary replication nodes. For PostgreSQL, the streaming replication feature is used, which relies on archive log shipping. SOLR ships with a built-in CDCR (Cross Data Center Replication) feature that can be used for disaster recovery. CouchDB has an outstanding replication architecture, replicating the primary database to the secondary database. The RPO can be set as needed, individually for both applications and databases. The RTO for the DR would be approximately an hour.
[Diagram: primary application and DB nodes replicating to the secondary site: application nodes via RSYNC, PostgreSQL via streaming replication with archive log shipping, SOLR via CDC replication, CouchDB via its native replication.]
  • 36. Application Monitoring
Third-Party Monitoring Tools
ChainSys uses third-party open-source monitoring tools such as Zabbix and Jenkins to monitor all the nodes. Zabbix supports tracking the availability and performance of the servers, virtual machines, applications (such as Apache, Tomcat, ActiveMQ, and Java), and databases (such as PostgreSQL and Redis) used in the Platform. Using Zabbix, the following are achieved:
- Various data collection methods and protocols
- Instant monitoring of all metrics using out-of-the-box templates
- Flexible trigger expressions and trigger dependencies
- Proactive network monitoring
- Remote command execution
- Flexible notifications
- Integration with external applications using the Zabbix API
The individual application monitoring systems can also be used for more in-depth analysis, but an integrated approach to looking into problems helps us be proactive and faster.
  • 37. In-Built Monitoring System
ChainSys is working on its own application monitoring tool that tracks essential parameters such as CPU and memory. The tool is also planned to monitor individual threads within the application, and is intended to handle most maintenance activities, such as patching, cloning, and database maintenance, from one single toolset. It will be integrated with Zabbix for monitoring and alerting.
  • 38. Supported Endpoints (Partial)
Cloud Applications: Oracle Sales Cloud, Oracle Marketing Cloud, Oracle Engagement Cloud, Oracle CRM On Demand, SAP C/4HANA, SAP S/4HANA, SAP BW, SAP Concur, SAP SuccessFactors, Salesforce, Microsoft Dynamics 365, Workday, Infor Cloud, Procore, Planview Enterprise One
PLM, MES & CRM: Windchill PTC, Oracle Agile PLM, Oracle PLM Cloud, Teamcenter, SAP PLM, SAP Hybris, SAP C/4HANA, Enovia, Proficy, Honeywell OptiVision, Salesforce Sales, Salesforce Marketing, Salesforce CPQ, Salesforce Service, Oracle Engagement Cloud, Oracle Sales Cloud, Oracle CPQ Cloud, Oracle Service Cloud, Oracle Marketing Cloud, Microsoft Dynamics CRM
HCM & Supply Chain Planning: Oracle HCM Cloud, SAP SuccessFactors, Workday, ICON, SAP APO and IBP, Oracle Taleo, Oracle Demantra, Oracle ASCP, Steelwedge
Project Management & EAM: Oracle Primavera, Oracle Unifier, SAP PM, Procore, Ecosys, Oracle EAM Cloud, Oracle Maintenance Cloud, JD Edwards EAM, IBM Maximo
Enterprise Storage Systems: OneDrive, Box, SharePoint, File Transfer Protocol (FTP), Oracle Webcenter, Amazon S3
Big Data: HIVE, Apache Impala, Apache HBase, Snowflake, MongoDB, Elasticsearch, SAP HANA, Hadoop, Teradata, Oracle Database, Redshift, BigQuery
NoSQL Databases: MongoDB, Solr, CouchDB, Elasticsearch
Databases: PostgreSQL, Oracle Database, SAP HANA, SYBASE, DB2, SQL Server, MySQL, MemSQL
Message Broker: IBM MQ, ActiveMQ
Development Platform: Java, .Net, Oracle PaaS, Force.com, IBM, ChainSys Platform
Enterprise Applications: Oracle E-Business Suite, Oracle ERP Cloud, Oracle JD Edwards, Oracle PeopleSoft, SAP S/4HANA, SAP ECC, IBM Maximo, Workday, Microsoft Dynamics, Microsoft Dynamics GP, Microsoft Dynamics Nav, Microsoft Dynamics Ax, Smart ERP, Infor, BaaN, Mapics, BPICS
  • 39. One End-to-End Platform for your Data Management needs: Data Migration, Data Reconciliation, Data Integration, Data Quality Management, Data Governance, Analytical MDM, Data Analytics, Data Catalog, and Data Security & Compliance. www.chainsys.com