SIMPLELOAD - Elementary 'last phase' solution for ETL
    by Ludek Bob Jankovsky, 31-May-2017 (ETL PATTERNS)
Although the solution is poor in comparison to true metadata-driven ETL, I often meet transformations realized as views with almost the same format as the target table, followed by a pattern-based implementation of the final stage - loading the transformed source data into the target table, considering current data and maintaining history.
I have to admit the approach is smart and appropriate for small, quick-win solutions.
To make implementation easy, I scripted a skeleton of the last-phase load element based on data dictionary information, supporting the most common patterns.
METASWAMP - XML schema
    by Ludek Bob Jankovsky, 01-Dec-2016 (ETL PATTERNS)
METASWAMP is a set of metadata-driven pattern and repository requirements implemented primarily as an XML schema. It is intended to serve as an interface between ETL patterns and repository storage(s).
Real-time ETL patterns and how to piss to pit
    by Ludek Bob Jankovsky, 16-Mar-2014 (ETL PATTERNS)
Recently I published an article about real-time ETL. While designing it we hit several challenges and considered various approaches. To be clear, we are speaking about real-time ETL that realizes data integration of sources, not service integration. We have to react to the smallest changes in source data and propagate them to the transformed target data structure. I described the advantages and disadvantages in the previous article.
Edges of Real-time ETL
    by Ludek Bob Jankovsky, 11-Jan-2014 (ETL PATTERNS)
Nowadays real-time and message-based ETL take a more significant place in IT infrastructure. Unlike message-based operational data, in the real-time approach target data are refreshed based on changes of source data in the source systems. It uses means that were earlier often used for data replication. Unlike replication, real-time ETL is hardly ever 1:1; a transformation takes place in the process. Real-time ETL brings many limitations in comparison to standard batch ETL.
Partitioning strategy patterns - principal element of Data architecture
    by Ludek Bob Jankovsky, 19-Aug-2011 (ETL PATTERNS)
One of the basic chapters in data architecture is the decision about partitioning strategy for particular data layers, table stereotypes and ETL patterns.
The following article describes the basic drivers influencing decisions about partitioning strategy. Details on particular patterns will be linked along the way.
Failover techniques - validation during the load phase
    by Ludek Bob Jankovsky, 09-May-2010 (ETL PATTERNS)
There are several places within the ETL process to catch wrong and low-quality data. Generally there are four basic moments:
  • Validation during the load phase
  • Validation during the transformation phase
  • Limited validation during the transformation phase, just defaulting invalid values
  • Ex-post validation - data quality management

The following article is about the first one.
Rigid Exchange Partition Data Acquisition approach
    by Ludek Bob Jankovsky, 05-Feb-2010 (ETL PATTERNS)
There are several ways to keep data warehouses and data marts consistent during business hours. In standard simple approaches this usually squeezes the loading time window into the really short period between readiness of source data and required readiness of target data. The following approach is just another way to keep all data in the mart consistent at any moment; still, I consider it interesting.
How to avoid wasting sequences in differential merge pattern
    by Ludek Bob Jankovsky, 25-Feb-2009 (ETL PATTERNS)
In Oracle 10g and 11g, getting values from sequences during MERGE behaves a bit strangely. Even if you reference the sequence directly in the WHEN NOT MATCHED clause, a sequence value is acquired and wasted even when the record is matched. According to the documented behaviour, a sequence value is acquired for every source record. That is bad news because the mentioned construction is often used in the DIFF_MERGE pattern.
Let's see what to do about it.
Metadata driven ETL approach
    by Ludek Bob Jankovsky, 10-Sep-2008 (ETL PATTERNS)
It could be named "Dream about metadata-driven ETL". The schema depicts all elements that should participate in the ETL generating process separately, to keep the system highly manageable.
Unpivot ETL transformation pattern - 10g vs. 11g
    by Ludek Bob Jankovsky, 09-Sep-2008 (ETL PATTERNS)
Unpivot transformation (several columns into several rows) becomes much simpler with Oracle 11g. Let's see.
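A minimal sketch of the "several columns into several rows" idea, written in Python with SQLite standing in for Oracle. It uses the portable UNION ALL form, not the UNPIVOT clause of Oracle 11g (SQLite has no such clause). The table sales_wide, its columns and the decision to skip NULL amounts are assumptions made for illustration only.

```python
import sqlite3


def unpivot_quarters(conn):
    """Return one (id, quarter, amount) row per non-NULL quarter column."""
    # One SELECT per source column, glued together with UNION ALL.
    # Skipping NULL amounts is an assumption of this sketch.
    sql = """
        SELECT id, 'Q1' AS quarter, q1 AS amount FROM sales_wide WHERE q1 IS NOT NULL
        UNION ALL
        SELECT id, 'Q2', q2 FROM sales_wide WHERE q2 IS NOT NULL
        UNION ALL
        SELECT id, 'Q3', q3 FROM sales_wide WHERE q3 IS NOT NULL
        UNION ALL
        SELECT id, 'Q4', q4 FROM sales_wide WHERE q4 IS NOT NULL
        ORDER BY id, quarter
    """
    return conn.execute(sql).fetchall()


# Hypothetical demo data: one wide row becomes up to four narrow rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_wide (id INTEGER, q1 REAL, q2 REAL, q3 REAL, q4 REAL)"
)
conn.execute("INSERT INTO sales_wide VALUES (1, 10, 20, NULL, 40)")
for row in unpivot_quarters(conn):
    print(row)
```

Each source column costs one extra SELECT branch here, which is the verbosity a dedicated unpivot operator removes.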
SELF_UPDATE ETL Transformation pattern
    by Ludek Bob Jankovsky, 02-Sep-2008 (ETL PATTERNS)
The self-update pattern is one of the simplest patterns. It performs updates based just on record data, such as closing records that fulfil specified conditions.
CREEPING_DEATH algorithm ETL Transformation pattern
    by Ludek Bob Jankovsky, 02-Sep-2008 (ETL PATTERNS)
One of the special patterns, related to a cool specific way of maintaining slowly changing dimensions of type 2 (valid_from, valid_thru). Unlike the "standard" way, obsolete records are terminated just based on the existence of a newer record for the same instance. A module based on the pattern is independent of the source of the transformation (it is common for all mappings filling the table) and is much more efficient in bulk processing.
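The termination step described above could look roughly like this sketch in Python with SQLite standing in for Oracle. The table dim_customer, the open-end marker 9999-12-31 and the convention that a closed record ends exactly where the next one starts are all assumptions of the example, not details taken from the article.

```python
import sqlite3

OPEN_END = "9999-12-31"  # assumed marker for a version that is still valid


def creeping_death(conn):
    """Close every open version that has a newer version for the same key.

    Only the target table is read, so the step does not depend on the
    source of any particular mapping. Returns the number of closed rows.
    """
    cur = conn.execute(
        """
        UPDATE dim_customer
           SET valid_thru = (SELECT MIN(n.valid_from)
                               FROM dim_customer n
                              WHERE n.customer_id = dim_customer.customer_id
                                AND n.valid_from > dim_customer.valid_from)
         WHERE valid_thru = ?
           AND EXISTS (SELECT 1
                         FROM dim_customer n
                        WHERE n.customer_id = dim_customer.customer_id
                          AND n.valid_from > dim_customer.valid_from)
        """,
        (OPEN_END,),
    )
    return cur.rowcount
```

Dates are stored as ISO strings so that plain string comparison orders them correctly. Running the function a second time closes nothing, because no open record with a newer sibling is left.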
FULL_REFRESH_DELETE ETL Transformation pattern
    by Ludek Bob Jankovsky, 31-Aug-2008 (ETL PATTERNS)
An ETL pattern is a standard functional task defining what has to happen, regardless of the technical implementation.
The FR_DELETE pattern task marks as deleted all records missing in the source.
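As a rough illustration of the task, here is a sketch in Python with SQLite standing in for Oracle. The tables src_product and tgt_product and the deleted_flag column are hypothetical names chosen for the example.

```python
import sqlite3


def fr_delete(conn):
    """Flag target rows whose key no longer appears in the full source extract.

    Rows are only marked, never physically removed. Returns the number of
    newly flagged rows.
    """
    cur = conn.execute(
        """
        UPDATE tgt_product
           SET deleted_flag = 1
         WHERE deleted_flag = 0
           AND NOT EXISTS (SELECT 1
                             FROM src_product s
                            WHERE s.product_id = tgt_product.product_id)
        """
    )
    return cur.rowcount
```

The sketch assumes the source holds a complete extract; with a partial (delta) source it would flag rows that merely did not change.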
INSERT ETL transformation patterns family
    by Ludek Bob Jankovsky, 28-Aug-2008 (ETL PATTERNS)
An ETL pattern is a standard functional task defining what has to happen, regardless of the technical implementation.
Basic insert-wise patterns:
  • INSERT - just inserts records from the source (simplest)
  • DIFF_INSERT - inserts new records only; existing records (based on a matching key) are ignored
  • SCD2_INSERT - inserts new and changed records. Changed records are inserted as duplicates. This supports loading SCD2 (slowly changing dimensions of type 2).
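The three insert-wise patterns listed above can be sketched as follows, in Python with SQLite standing in for Oracle. The tables src_customer and tgt_customer, their columns, and the rule that "changed" means "differs from the latest version by valid_from" are assumptions made for the example.

```python
import sqlite3


def insert_all(conn, load_date):
    """INSERT: copy every source row, no questions asked."""
    cur = conn.execute(
        """
        INSERT INTO tgt_customer (customer_id, name, valid_from)
        SELECT customer_id, name, ? FROM src_customer
        """,
        (load_date,),
    )
    return cur.rowcount


def diff_insert(conn, load_date):
    """DIFF_INSERT: insert only rows whose key is not in the target yet."""
    cur = conn.execute(
        """
        INSERT INTO tgt_customer (customer_id, name, valid_from)
        SELECT s.customer_id, s.name, ?
          FROM src_customer s
         WHERE NOT EXISTS (SELECT 1 FROM tgt_customer t
                            WHERE t.customer_id = s.customer_id)
        """,
        (load_date,),
    )
    return cur.rowcount


def scd2_insert(conn, load_date):
    """SCD2_INSERT: insert new keys and new versions of changed rows."""
    # A source row is skipped only if the latest target version of the
    # same key carries the same attribute value (IS compares NULL-safely).
    cur = conn.execute(
        """
        INSERT INTO tgt_customer (customer_id, name, valid_from)
        SELECT s.customer_id, s.name, ?
          FROM src_customer s
         WHERE NOT EXISTS (
               SELECT 1 FROM tgt_customer t
                WHERE t.customer_id = s.customer_id
                  AND t.name IS s.name
                  AND t.valid_from = (SELECT MAX(t2.valid_from)
                                        FROM tgt_customer t2
                                       WHERE t2.customer_id = s.customer_id))
        """,
        (load_date,),
    )
    return cur.rowcount
```

Each function returns the number of inserted rows. The sketch leaves closing of the older versions to a separate step.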
DIFF_MERGE - Differential merge ETL transformation pattern
    by Ludek Bob Jankovsky, 28-Aug-2008 (ETL PATTERNS)
An ETL pattern is a standard functional task defining what has to happen, regardless of the technical implementation.
DIFF_MERGE pattern task:
  • inserts new elements
  • updates changed existing elements
  • ignores unchanged existing elements
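The three behaviours listed above can be sketched in Python with SQLite standing in for Oracle. Two plain statements are used here in place of a single MERGE, purely to keep the example portable. The tables src_account and tgt_account and their columns are hypothetical, and the source key is assumed to be unique.

```python
import sqlite3


def diff_merge(conn):
    """Apply source rows to the target; return (inserted, updated) counts."""
    # Update only rows whose attribute really differs (IS NOT is NULL-safe
    # in SQLite), so unchanged rows are left untouched.
    cur = conn.execute(
        """
        UPDATE tgt_account
           SET balance = (SELECT s.balance
                            FROM src_account s
                           WHERE s.account_id = tgt_account.account_id)
         WHERE EXISTS (SELECT 1
                         FROM src_account s
                        WHERE s.account_id = tgt_account.account_id
                          AND s.balance IS NOT tgt_account.balance)
        """
    )
    updated = cur.rowcount

    # Insert rows whose key is missing in the target.
    cur = conn.execute(
        """
        INSERT INTO tgt_account (account_id, balance)
        SELECT s.account_id, s.balance
          FROM src_account s
         WHERE NOT EXISTS (SELECT 1 FROM tgt_account t
                            WHERE t.account_id = s.account_id)
        """
    )
    inserted = cur.rowcount
    return inserted, updated
```

The update runs before the insert so that freshly inserted rows are never re-examined; a second run over the same source reports (0, 0).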
All Rights Reserved © 2007, Designed by Bob Jankovsky