Overview
Data Sync Manager™ (DSM) is a high-performance data integration and migration platform that performs and manages ETL functions over data warehouses and Big Data, and provides monitoring, notification, scheduling, and logging services. Simply put, DSM can be regarded as a Big Data-savvy ETLM, or ETL Management System. Because DSM is based on Java™, it runs on Windows, Mac, Linux, or any OS that supports a Java runtime.
Architecture
DSM is composed of three independent layers of our proprietary software: DSM Server™ (Data Sync Manager Server), FrameBuilder™, and packages of ETL and Big Data components. DSM Server, a transaction middleware that provides the services DSM requires, and FrameBuilder, a runtime container (engine) for the components, form the basis of the DSM platform. Both have proven their versatility in various projects we have implemented: DSM Server provides a solid transaction infrastructure in addition to the other backend services inherent in any web application server, and FrameBuilder provides WYSIWYG-based GUI building capability. On top of these foundations, DSM builds robust ETL components as well as various handlers for modern Big Data.
Benefits over other ETL solutions
The major benefits of DSM over other ETL solutions are 1) a decrease in data processing time of approximately 45% to 90% and 2) a decrease in development time of up to 90%.
Before
  • Large amount of time required to develop programs
  • Complex, tightly coupled data
  • Hard to share metadata
  • Difficult to perform sequential/parallel execution of scheduled jobs
  • Hard to verify the integrity of source data
  • Hard to spot errors at each step of a job
  • High maintenance cost
After
  • Easy, intuitive, streamlined development environment
  • Easy to share metadata
  • Easy to perform sequential/parallel execution of scheduled jobs
  • Easy to monitor/manage job logs and errors
  • Easy to update/modify programs
  • Easy to spot errors at each step of a job
  • Lower maintenance cost
DSM Basics
In DSM, all ETL programs break down into Projects, Schedules, Processes, and Functions. A function is the atomic unit of a program, and a group of functions forms a process. A group of processes belongs to a schedule, which controls the scheduling of its child processes, and a project can contain one or more schedules, as outlined below.
Project
  • A DSM Workspace can maintain multiple projects
  • Project changes are logged
  • Projects can declare global properties, which are automatically inherited by their children (schedules, processes, and functions)
  • Documentation is generated per project
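The property inheritance described above can be illustrated with a minimal sketch. This is not DSM's actual API: the class `DsmNode`, its `resolve` method, and the property keys are hypothetical names chosen for illustration, and the rule that a child's own value overrides an inherited one is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not DSM's actual API: a node stands for a project,
// schedule, process, or function. Each node may define its own properties
// and falls back to its parent when a key is not set locally (assumed rule).
class DsmNode {
    final String name;
    final DsmNode parent;                    // null for a project (the root)
    final Map<String, String> properties = new HashMap<>();
    final List<DsmNode> children = new ArrayList<>();

    DsmNode(String name, DsmNode parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    // Look up a property locally first, then walk up toward the project.
    String resolve(String key) {
        for (DsmNode n = this; n != null; n = n.parent) {
            String value = n.properties.get(key);
            if (value != null) {
                return value;
            }
        }
        return null; // not defined anywhere in the chain
    }
}

class PropertyInheritanceDemo {
    public static void main(String[] args) {
        DsmNode project = new DsmNode("sales-project", null);
        DsmNode schedule = new DsmNode("nightly", project);
        DsmNode process = new DsmNode("load-orders", schedule);
        DsmNode function = new DsmNode("read-table", process);

        project.properties.put("db.url", "jdbc:example://warehouse");
        process.properties.put("batch.size", "500");

        System.out.println(function.resolve("db.url"));     // value declared on the project
        System.out.println(function.resolve("batch.size")); // value declared on the process
    }
}
```

Walking up the parent chain at lookup time, rather than copying values into each child, means a change to a project-level property is visible to every descendant immediately.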
Schedule
  • Unit of execution
  • Schedules can run on a timetable or be executed manually by a user
  • Schedules can be executed from an external system by implementing an interface
  • Success or error during execution can be handled by defining scriptlets
  • Schedules can be set to retry in the event of an error
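As a rough illustration of how retries and success/error handling might fit together, the sketch below runs a job, retries it a fixed number of times, and then invokes one of two callbacks. It does not reflect DSM's real scriptlet mechanism: `RetryingSchedule`, `maxRetries`, and the callback parameters are hypothetical stand-ins.

```java
import java.util.concurrent.Callable;
import java.util.function.Consumer;

// Hypothetical sketch, not DSM's actual API: the two callbacks stand in for
// the success and error scriptlets mentioned above.
class RetryingSchedule {
    private final int maxRetries;
    private final Runnable onSuccess;
    private final Consumer<Exception> onError;

    RetryingSchedule(int maxRetries, Runnable onSuccess, Consumer<Exception> onError) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        this.maxRetries = maxRetries;
        this.onSuccess = onSuccess;
        this.onError = onError;
    }

    // Runs the job once plus up to maxRetries more times.
    // Returns the number of attempts that were made.
    int execute(Callable<Void> job) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
            try {
                job.call();
            } catch (Exception e) {
                last = e;      // remember the failure and try again
                continue;
            }
            onSuccess.run();   // called outside the try so its own failures are not retried
            return attempt;
        }
        onError.accept(last);  // every attempt failed
        return maxRetries + 1;
    }
}
```

For example, `new RetryingSchedule(3, () -> System.out.println("done"), e -> System.err.println(e)).execute(job)` would call `job` at most four times. A production scheduler would typically also wait between attempts; that is omitted here to keep the sketch short.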
Process
  • A process has its own properties and checks the properties of its child functions
  • Database isolation level and transaction properties are set at the process level
  • Java and the Groovy scripting language can be used to define the details of transformation rules
  • Success or error during execution can be handled by defining scriptlets
Functions
  • A function has its own properties
  • 87 pre-built functions are provided as reusable components
  • Custom functions can be added by implementing the DSM component plug-in interface
  • Supports Big Data and Hadoop
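The DSM component plug-in interface itself is not shown in this document, so the following is only a sketch of the general shape such a custom function could take. The `RowFunction` interface, the `UpperCaseFunction` class, the `FunctionRegistry` class, and the `column` property are all hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical plug-in contract; DSM's real interface is not shown in this document.
interface RowFunction {
    Map<String, Object> apply(Map<String, Object> row, Map<String, String> properties);
}

// Example custom function: upper-cases the column named by the "column" property.
class UpperCaseFunction implements RowFunction {
    @Override
    public Map<String, Object> apply(Map<String, Object> row, Map<String, String> properties) {
        String column = properties.get("column");
        Map<String, Object> out = new LinkedHashMap<>(row); // copy so the input row is untouched
        Object value = out.get(column);
        if (value instanceof String) {
            out.put(column, ((String) value).toUpperCase(Locale.ROOT));
        }
        return out;
    }
}

// Hypothetical registry so that a process could look up functions by name.
class FunctionRegistry {
    private final Map<String, RowFunction> functions = new HashMap<>();

    void register(String name, RowFunction function) {
        functions.put(name, function);
    }

    RowFunction lookup(String name) {
        RowFunction function = functions.get(name);
        if (function == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        return function;
    }
}
```

Keeping each function behind a single small interface is what would allow pre-built and custom components to be chained interchangeably inside a process.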