Data Integration Platform: More Than a Comprehensive ETL Suite

What is ETL?

ETL Diagram
ETL stands for Extract, Transform and Load, which describes three distinct functions.
The term often refers to software that combines these functions into a single tool, or to the processes themselves. In a sentence,

ETL is a software tool or process that extracts data from one or more data sources, transforms it, and loads it into another data store.

As IT infrastructure grows more complex and the data it produces grows larger, the traditional ETL concept can no longer fully describe or cover the needs of modern data integration and data migration.

The following describes what each of the ETL functions has traditionally been responsible for.

Extract
The process of extracting, reading, or otherwise acquiring data from a variety of data sources such as databases, files, streams, or even programmatically generated in-memory data.
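
As a rough illustration, the following Python sketch shows what the Extract function amounts to in practice, using only the standard library. The source names ("source.db", the "orders" table, "customers.csv") are hypothetical examples, not references to any particular product.

    # Hypothetical Extract sketch: pull rows from a relational source and a flat file.
    import csv
    import sqlite3

    def extract_from_database(db_path, query):
        """Read rows from a relational source into plain dictionaries."""
        conn = sqlite3.connect(db_path)
        conn.row_factory = sqlite3.Row
        try:
            return [dict(row) for row in conn.execute(query)]
        finally:
            conn.close()

    def extract_from_csv(path):
        """Read rows from a flat-file source."""
        with open(path, newline="") as f:
            return list(csv.DictReader(f))

    # orders = extract_from_database("source.db", "SELECT * FROM orders")
    # customers = extract_from_csv("customers.csv")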
Transform

The process of converting the extracted data from its original form into the form it needs to be in so that it can be transferred to or stored in another data store. Transformations are generally performed according to defined rules. These rules can be as simple as migrating data from one source to another, or as involved as building an entirely new data set by combining multiple data sources with complex conversion and business logic.
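
A matching sketch of the Transform function follows, under the assumption that two extracted sources (orders and customers) are joined and converted with a few simple rules; the field names and the cents-to-dollars conversion are made up for the example.

    # Hypothetical Transform sketch: join two extracted sources and apply
    # field-level conversion rules before the data is loaded anywhere.
    def transform(orders, customers):
        customers_by_id = {c["customer_id"]: c for c in customers}
        result = []
        for order in orders:
            customer = customers_by_id.get(order["customer_id"], {})
            result.append({
                "order_id": int(order["id"]),
                "customer_name": customer.get("name", "UNKNOWN"),
                # rule: the source stores cents as an integer, the target wants dollars
                "amount_usd": int(order["amount_cents"]) / 100.0,
            })
        return result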

Load
The process of writing or sending the transformed data to an output target such as a database, file, or piped stream.
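
And a sketch of the Load function writing the transformed rows into a target table; the target schema (an "orders_clean" SQLite table) is again an assumption made only for illustration.

    # Hypothetical Load sketch: write transformed rows to the output target.
    import sqlite3

    def load(rows, db_path):
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS orders_clean "
                "(order_id INTEGER PRIMARY KEY, customer_name TEXT, amount_usd REAL)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO orders_clean "
                "VALUES (:order_id, :customer_name, :amount_usd)",
                rows,
            )
            conn.commit()
        finally:
            conn.close()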
The above functions form the basic building blocks of every ETL solution, and every data migration/integration requirement involves at least these functions. But this does not mean that traditional ETL is enough for the data integration and migration software we need today. We need to extract from and load into multiple heterogeneous relational, hierarchical, or object databases, distributed big data systems, software applications, and more.

Modern ETL solutions also need to perform these operations in real time, on a scheduled basis, or sometimes on an arbitrary basis during a defined period of time. And transformation rules are no longer limited to a fixed set of predefined rules; they should now cover almost any combination of extract, transform, and load steps, including business logic that requires nested groups of these functions.
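
To make that last point concrete, the sketch below chains the extract, transform, and load functions from the previous examples into one nested pipeline and re-runs it on a fixed interval; the 60-second interval is an arbitrary stand-in for whatever real-time or scheduled policy a project actually requires.

    # Hypothetical composition sketch, reusing the functions sketched above.
    import time

    def run_pipeline():
        """One nested group of extract -> transform -> load functions."""
        orders = extract_from_database("source.db", "SELECT * FROM orders")
        customers = extract_from_csv("customers.csv")
        load(transform(orders, customers), "target.db")

    def run_on_schedule(interval_seconds=60):
        """Crude scheduler: re-run the whole pipeline on a fixed interval."""
        while True:
            run_pipeline()
            time.sleep(interval_seconds)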
Data Sync Manager - An ETL Solution?
DSM Architecture

Data Sync Manager™ (DSM) is the outcome of our unique approach to accommodating not only traditional ETL functions but also the demanding requirements of modern data integration, analytics, and business intelligence projects. Across data integration projects at various IT organizations, we have been able to prove that DSM seamlessly handles data integration, migration, and BI analytics problems. Scalability and cost-effectiveness were rarely considered in conventional ETL, yet they become important factors in distributed, heterogeneous system environments. Even though it is possible to process Big Data with the conventional ETL tools mentioned above, purpose-built tools such as Hadoop provide greater cost-effectiveness and scalability when dealing with massive volumes of data in a variety of forms. From ETL's perspective,
DSM can be defined as a data integration/migration platform with a comprehensive set of tools for the Extract, Transform, and Load processes that form the backbone of all DW (data warehouse) tools, including integrated support for Big Data.

80+ Built-in Components

Big Data - Hadoop
Database - DW Query, Sync multi DB etc.
Archiving & Compression
File & Memory
Networking - FTP, SSH, URL etc.
LDAP
SAP
Verification
Notification - Mail
Memory Database
Mappers - Database, Memory
Executors - Remote, local, shell executors
and more ...

DSM Features

Collaborative development
Allows multiple clients to access an authorized development workspace simultaneously
Support for Big Data
15+ built-in Hadoop components for HBase and HDFS
Realtime monitoring
Supports both GUI and console real-time monitoring
Automatic Documentation
Generates documentation based on the project definition
Input/Output
Every component has its own input and output whenever possible, so defining process rules is as easy as assembling Lego blocks (see the sketch after this list)
Supports function-level transaction control
Supports sequential and parallel process execution
Programmatic/Manual intervention of each function
Users can define any variation of predefined functions via programmatic access
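
As referenced in the Input/Output feature above, the following generic sketch illustrates the "Lego block" idea: components that each expose an input and an output can be chained sequentially or fanned out in parallel. The component interface and the example steps are hypothetical and do not represent DSM's actual API.

    # Generic, hypothetical sketch of component composition (not DSM's API).
    from concurrent.futures import ThreadPoolExecutor

    # A "component" is anything that takes a list of rows and returns a list of rows.
    def run_sequential(data, steps):
        """Pipe the output of each component into the next one."""
        for step in steps:
            data = step(data)
        return data

    def run_parallel(data, branches):
        """Run independent components over the same input concurrently."""
        with ThreadPoolExecutor() as pool:
            return list(pool.map(lambda step: step(data), branches))

    # Example components (hypothetical):
    drop_nulls = lambda rows: [r for r in rows if all(v is not None for v in r.values())]
    uppercase_names = lambda rows: [{**r, "name": r.get("name", "").upper()} for r in rows]

    # cleaned = run_sequential(raw_rows, [drop_nulls, uppercase_names])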