What is SAP BODS?
SAP BODS stands for SAP BusinessObjects Data Services. SAP itself is short for Systems, Applications, and Products in Data Processing, and BusinessObjects Data Services is the name of the tool. BODS is an ETL tool. ETL stands for Extract, Transform, and Load.
ETL is much easier and faster to use than the traditional methods of moving data, which involve writing conventional computer programs. ETL tools contain graphical interfaces that speed up the process of mapping tables and columns between the source and target databases.
Example SAP ERP Database
Different source systems
Problems with different data systems in one organisation
The Challenges of Data Integration
Integrating disparate data has always been a difficult task, and given the data explosion occurring in most organizations, this task is not getting any easier. Over 69% of respondents to our survey rated data integration issues as either a very high or high inhibitor to implementing new applications. The three main data integration issues (see Figure 1) listed by respondents were data quality and security, lack of a business case and inadequate funding, and a poor data integration infrastructure.
Top Data Integration Issues
Figure 1. The top inhibitors to the success of data integration projects. Respondents were asked to select up to three. Based on 672 respondents.
Characteristics of Data Integration
Data integration involves a framework of applications, techniques, technologies, and products for providing a unified and consistent view of enterprise business data (see Figure 2).
- Applications are custom-built and vendor-developed solutions that utilize one or more data integration products.
- Products are off-the-shelf commercial solutions that support one or more data integration technologies.
- Technologies implement one or more data integration techniques.
- Techniques are technology-independent approaches for doing data integration.
Data Integration Framework
Figure 2. Components of a data integration solution.
Following is a review of the techniques and technologies used in data integration projects.
Data Integration Techniques
There are three main techniques used for integrating data: consolidation, federation, and propagation.
Data Consolidation captures data from multiple source systems and integrates it into a single persistent data store. This data store may be used for reporting and analysis as in data warehousing, or it can act as a source of data for downstream applications as in an operational data store.
With data consolidation, there is usually a delay, or latency, between the time updates occur in source systems and the time those updates appear in the target store. Depending on business needs, this latency may be a few seconds, several hours, or many days. The term near real time is often used to describe target data that has a low latency of a few seconds, minutes, or hours. Data with zero latency is known as real-time data, but this is difficult to achieve using data consolidation.
Data Federation provides a single virtual view of one or more source data files. When a business application issues a query against this virtual view, a data federation engine retrieves data from the appropriate source data stores, integrates it to match the virtual view and query definition, and sends the results to the requesting business application. By definition, data federation always pulls data from source systems on an on-demand basis. Any required data transformation is done as the data is retrieved from the source data files. Enterprise information integration (EII) is an example of a technology that supports a federated approach to data integration.
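As a rough illustration of this pull-on-demand behaviour, the following Python sketch treats two in-memory SQLite databases as separate source systems and assembles a virtual view only when it is requested. The table names, column names, and the `federated_customer_view` function are hypothetical, and a real EII product would add query optimisation, caching, and security on top.

```python
import sqlite3

def make_crm():
    """Hypothetical source system 1: customer master data."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    con.executemany("INSERT INTO customers VALUES (?, ?)",
                    [(1, "Acme"), (2, "Globex")])
    return con

def make_billing():
    """Hypothetical source system 2: invoices, with amounts stored in cents."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE invoices (customer_id INTEGER, amount_cents INTEGER)")
    con.executemany("INSERT INTO invoices VALUES (?, ?)",
                    [(1, 1000), (1, 2550), (2, 990)])
    return con

def federated_customer_view(crm, billing):
    """Build the virtual view on demand; nothing is stored persistently."""
    totals = dict(billing.execute(
        "SELECT customer_id, SUM(amount_cents) FROM invoices GROUP BY customer_id"))
    rows = []
    for cust_id, name in crm.execute("SELECT id, name FROM customers ORDER BY id"):
        # Transformation happens while the data is retrieved: cents -> currency units.
        rows.append({"customer": name, "total": totals.get(cust_id, 0) / 100})
    return rows

crm, billing = make_crm(), make_billing()
view = federated_customer_view(crm, billing)
```

Every call to `federated_customer_view` queries the sources again, so the result always reflects their current state, at the cost of hitting the source systems for each query.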
Data Propagation applications copy data from one location to another. These applications usually operate online and push data to the target location; i.e., they are event-driven. Updates to a source system may be propagated asynchronously or synchronously to the target system. Synchronous propagation requires that updates to both source and target systems occur in the same physical transaction. Regardless of the type of synchronization used, propagation guarantees the delivery of the data to the target. This guarantee is a key distinguishing feature of data propagation. Most synchronous data propagation technologies support a two-way exchange of data between a data source and a data target. Enterprise application integration (EAI) and enterprise data replication (EDR) are examples of technologies that support data propagation.
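The synchronous case can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the source and target are two SQLite databases attached to one connection, so that both updates share a single transaction, and the `orders` tables and `propagate_sync` function are hypothetical. Asynchronous propagation, which would typically queue the change and apply it to the target later, is not shown.

```python
import sqlite3

# One connection with a second database attached, standing in for a
# source system ("main") and a target system ("target").
con = sqlite3.connect(":memory:")
con.execute("ATTACH DATABASE ':memory:' AS target")
con.execute("CREATE TABLE main.orders (id INTEGER PRIMARY KEY, status TEXT)")
con.execute("CREATE TABLE target.orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")

def propagate_sync(con, order_id, status):
    """Write to source and target in the same transaction."""
    # `with con` commits if both statements succeed and rolls back otherwise.
    with con:
        con.execute("INSERT INTO main.orders VALUES (?, ?)", (order_id, status))
        con.execute("INSERT INTO target.orders VALUES (?, ?)", (order_id, status))

propagate_sync(con, 1, "shipped")

try:
    # The target rejects a NULL status, so the source insert is rolled back too.
    propagate_sync(con, 2, None)
except sqlite3.IntegrityError:
    pass

source_ids = [r[0] for r in con.execute("SELECT id FROM main.orders ORDER BY id")]
target_ids = [r[0] for r in con.execute("SELECT id FROM target.orders ORDER BY id")]
```

After the failed second call, neither store contains order 2, which is the property that a shared transaction provides.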
A Hybrid Approach. The techniques used by data integration applications will depend on both business and technology requirements. It is quite common for a data integration application to use a hybrid approach that involves several data integration techniques.
Data Integration Technologies
A wide range of technologies are available for implementing the data integration techniques outlined above. This section reviews four of the main ones: extract, transform, and load (ETL); enterprise information integration (EII); enterprise application integration (EAI); and enterprise data replication (EDR). Master data management (MDM) and customer data integration (CDI), which are really data integration applications, are also discussed because they are often thought of as data integration technologies.
Extract, Transform, and Load
As the name implies, ETL technology extracts data from source systems, transforms it to satisfy business requirements, and loads the results into a target destination. Sources and targets are usually databases and files, but they can also be other types of data stores such as a message queue. ETL supports a consolidation approach to data integration.
Data can be extracted in schedule-driven pull mode or event-driven push mode. Both modes can take advantage of changed data capture. Pull mode operation supports data consolidation and is typically done in batch. Push mode operation is done online by propagating data changes to the target data store.
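One common way to implement changed data capture in pull mode is a timestamp watermark: each scheduled run extracts only the rows modified since the previous run. The sketch below assumes every source row carries an `updated_at` value. That field name, the sample rows, and the `extract_changes` function are illustrative assumptions, and log-based or trigger-based capture are alternatives that are not shown.

```python
def extract_changes(source_rows, last_watermark):
    """Return rows changed after the watermark, plus the new watermark."""
    changed = [r for r in source_rows if r["updated_at"] > last_watermark]
    # If nothing changed, keep the old watermark.
    new_watermark = max((r["updated_at"] for r in changed), default=last_watermark)
    return changed, new_watermark

# Hypothetical source table; updated_at is an increasing timestamp.
source = [
    {"id": 1, "name": "Acme", "updated_at": 100},
    {"id": 2, "name": "Globex", "updated_at": 205},
    {"id": 3, "name": "Initech", "updated_at": 310},
]

# First scheduled run: only rows changed since the previous run (watermark 200).
batch1, wm1 = extract_changes(source, last_watermark=200)

# A later run finds nothing new until the source changes again.
batch2, wm2 = extract_changes(source, last_watermark=wm1)
```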
Data transformation may involve data record restructuring and reconciliation, data content cleansing, and/or data content aggregation. Data loading may cause a complete refresh of a target data store or may be done by updating the target destination. Interfaces used here include de facto standards such as ODBC, JDBC, and JMS, as well as native database and application interfaces.
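The extract, transform, and load steps can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not how an ETL product works internally: the `sales` and `sales_by_region` tables and the three functions are hypothetical. It shows content cleansing, aggregation, and the choice between a complete refresh and an update of the target.

```python
import sqlite3

# Hypothetical source with inconsistently formatted region names.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE sales (region TEXT, amount REAL)")
src.executemany("INSERT INTO sales VALUES (?, ?)",
                [(" north ", 10.0), ("North", 5.0), ("south", 7.5)])

# Hypothetical target holding one aggregated row per region.
tgt = sqlite3.connect(":memory:")
tgt.execute("CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total REAL)")

def extract(con):
    return con.execute("SELECT region, amount FROM sales").fetchall()

def transform(rows):
    """Cleanse region names, then aggregate amounts per region."""
    totals = {}
    for region, amount in rows:
        key = region.strip().upper()
        totals[key] = totals.get(key, 0.0) + amount
    return sorted(totals.items())

def load(con, rows, full_refresh=False):
    """Either refresh the target completely or update matching rows in place."""
    with con:
        if full_refresh:
            con.execute("DELETE FROM sales_by_region")
        con.executemany("INSERT OR REPLACE INTO sales_by_region VALUES (?, ?)", rows)

load(tgt, transform(extract(src)), full_refresh=True)
result = tgt.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region").fetchall()
```

Calling `load` with `full_refresh=False` leaves existing target rows in place and only inserts or overwrites the regions present in the new batch.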
In our survey, 57% of respondents rated their batch ETL usage as high (see Figure 3). Adding a medium rating to the result increases the figure to 81%. The survey also asked what the likely usage of batch ETL will be in two years. The result was 58% for high usage, and 82% for high and medium. As expected, these figures demonstrate that the batch ETL market has flattened out because most organizations use it.
ETL Use in Organizations
Figure 3. Batch ETL use is flat, but changed data capture and online ETL use will grow over the next two years. Based on 672 respondents.
The picture changes when looking at the growth figures for changed data capture (CDC) and online ETL operations. Our survey shows 16% of respondents rated their usage of CDC in ETL today as high. This number grows to 36% in two years. The equivalent figures for online ETL (called real-time or trickle-feed ETL in the survey) were 6% and 23%, respectively. These growth trends are due primarily to shrinking batch windows and the increasing need for low-latency data. It is interesting to note that combining the high and medium usage figures for the two-year projection of online ETL gives a result of 55%. This clearly shows the industry is moving from batch to online ETL usage.
Enterprise Information Integration
EII provides a virtual business view of dispersed data. This view can be used for demand-driven query access to operational business transaction data, a data warehouse, and/or unstructured information. EII supports a data federation approach to data integration.
The objective of EII is to enable applications to see dispersed data as though it resided in a single database. EII shields applications from the complexities of retrieving data from multiple locations, where the data may differ in semantics and formats, and may employ different data interfaces.
Distinguishing features to look for when evaluating EII products include the data sources and targets supported (including Web services and unstructured data), transformation capabilities, metadata management, source data update capabilities, authentication and security options, performance, and caching.
In our survey, 5% of respondents rated their EII use as high (see Figure 4). Adding a medium rating to the result increases the figure to 19%. These figures grow to 22% and 52% respectively in two years, indicating considerable interest in exploiting EII technology in the future.
EII Use in Organizations
Figure 4. EII use is low at present but its usage is likely to grow rapidly. Based on 672 respondents.
Enterprise Application Integration
EAI integrates application systems by allowing them to communicate and exchange business transactions, messages, and data with each other using standard interfaces. It enables applications to access data transparently without knowing its location or format. EAI is usually employed for real-time operational business transaction processing. It supports a data propagation approach to data integration.
The direction of the EAI industry is toward the use of an enterprise service bus (ESB) that supports the interconnection of legacy and packaged applications, and also Web services that form part of a service-oriented architecture (SOA).
From a data integration perspective, EAI can be used to transport data between applications and to route real-time event data to other data integration applications like an ETL process. Access to application sources and targets is done via Web services, Microsoft .NET interfaces, Java-related capabilities such as JMS, legacy application interfaces and adapters, etc.
EAI is designed to propagate small amounts of data from one application to another. This propagation can be synchronous or asynchronous, but is nearly always done within the scope of a single business transaction. In the case of asynchronous propagation, the business transaction may be broken down into multiple physical transactions. An example would be a travel request that is broken down into separate but coordinated airline, hotel, and car reservations.
Data transformation and metadata capabilities in an EAI system are focused toward simple transaction and message structures, and they cannot usually support the complex data structures handled by ETL products. In this regard, EAI does not compete with ETL.
In our survey, 9% of respondents rated their EAI usage as high (see Figure 5). Adding a medium rating increases the figure to 29%. These figures grow to 26% and 58% respectively in two years. It is important to point out that the question relates to the use of EAI for data integration, as opposed to the use of EAI in the organization overall. The two-year EAI projection of 58% is consistent with the 55% growth figure for online ETL use mentioned earlier. This suggests that organizations see the need to merge the event-driven benefits of EAI with the transformation and consolidation power of ETL.
EAI Use in Organizations
Figure 5. EAI growth is consistent with the growth in online ETL use shown in Figure 3. This suggests the two technologies will be used together. Based on 672 respondents.
This study looked at data integration approaches across a wide range of different companies and applications. The study results show that these companies fall into two main groups:
- Large organizations that are moving toward building an enterprisewide data integration architecture. These companies typically have a multitude of data stores and large amounts of legacy data. They focus on buying an integrated product set and are interested in leading-edge data integration technologies. These organizations also buy high-performance best-of-breed products that work in conjunction with mainline data integration products to handle the integration of large amounts of data. They are also more likely to have a data integration competency center.
- Medium-sized companies that are focused on data integration solely from a business intelligence viewpoint and who evaluate products from the perspective of how well they will integrate with the organization’s BI tools and applications. These companies often have less legacy data, and are less interested in leading-edge approaches such as right-time data and Web services.
In evaluating and applying the contents of this report, it is important to understand which of the two categories your company fits into, and thus how sophisticated a data integration environment your company needs. Nonetheless, many of the ideas and concepts presented in this report apply equally to all companies, regardless of size. The main message of this report is that data integration problems are becoming a barrier to business success, and your company must have an enterprisewide data integration strategy if it is to overcome this barrier.
SAP Data Services and associated best practice
Prior to the introduction of the SAP S/4HANA Migration Cockpit to the on-premise world, SAP Data Services was the only tool recommended and fully supported for data migration to SAP S/4HANA (on-premise). Furthermore, we have built and continue to deliver associated accelerators in the form of the Best Practice for “rapid data migration to SAP S/4HANA (on premise)”, available at https://rapid.sap.com/bp/#/browse/packageversions/RDM_S4H_OP (always check for the latest version of the BP).
This best practice is built on the capabilities of SAP Data Services, the market-leading data integration tool with full data quality capabilities. The logical architecture and scope of this solution are depicted in the diagram below:
The content delivered with Best Practice includes:
- Detailed documentation for technical set-up, preparation and execution of migration for each supported object and extensibility guide.
- SAP Data Services (DS) files, including IDoc status check and Reconciliation jobs.
- IDoc mapping templates for SAP S/4HANA (in MS Excel).
- Lookup files for Legacy to SAP S/4HANA value mapping.
- Migration Services Tool for value mapping and lookup files management.
- WebI Reporting content for SAP BusinessObjects BI platform.
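To give a feel for what legacy-to-target value mapping with a lookup file involves, here is a small Python sketch. The actual layout of the lookup files delivered with the Best Practice is not described here, so the two-column CSV format, the sample values, and the function names below are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical lookup file content: legacy value -> target value.
LOOKUP_CSV = """legacy_value,target_value
GER,DE
NL,NL
U.S.A.,US
"""

def load_lookup(text):
    """Parse the lookup file into a dictionary."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["legacy_value"]: row["target_value"] for row in reader}

def map_values(records, field, lookup):
    """Replace legacy values in `field`; collect records with no mapping."""
    mapped, unmapped = [], []
    for rec in records:
        legacy = rec[field]
        if legacy in lookup:
            mapped.append({**rec, field: lookup[legacy]})
        else:
            unmapped.append(rec)
    return mapped, unmapped

lookup = load_lookup(LOOKUP_CSV)
legacy_records = [
    {"partner": "Acme GmbH", "country": "GER"},
    {"partner": "Initech", "country": "U.S.A."},
    {"partner": "Umbrella", "country": "UK"},
]
mapped, unmapped = map_values(legacy_records, "country", lookup)
```

Records that end up in `unmapped` point to gaps in the lookup file that need to be resolved before loading.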
Before we proceed, it is worth distinguishing two use cases of this best practice, depending on your requirements and the license in place. As per Note 2239701 – SAP Rapid Data Migration for SAP S/4HANA, on premise edition:
If you own either a runtime (REAB) or full use SAP HANA license, this includes a limited use license of SAP Data Services software restricted to loading data into SAP HANA (called Data Integrator license). This fills the minimum requirement for the SAP Rapid Data Migration to SAP S/4HANA content which includes full ETL (Extract, Transform, and Load) used to extract data from heterogeneous source systems, the transformation and mapping, the validation and the data load.
In other words, the limited use license allows you to take advantage of all scope items in the diagram above. Additional licenses may be useful for two specific extensions to the core functionality – advanced data profiling and advanced data cleansing.
With regard to advanced data cleansing, there is a dedicated job to standardise, cleanse, match and de-duplicate Business Partner names and addresses. It uses Data Quality transforms and thus requires the full Data Services license.
The SAP Information Steward comes into the picture to provide advanced data profiling capabilities based on its Data Insight that includes the ability to run extensive types of profiling like:
- Columns – used to examine the values and characteristics of data elements such as minimum, maximum, median, distribution of words, and so on.
- Address – used to determine the quality of an address. This sends data through the address cleansing transforms in Data Services to identify those that are valid, correctable, or invalid.
- Dependency – used to identify attribute-level relationships in the data. For example, for each state you can find the corresponding city name.
- Redundancy – used to determine the degree of overlapping data values or duplication between two sets of columns.
- Uniqueness – returns the count and percentage of rows that contain nonunique data for the set of columns selected.
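The column, dependency, redundancy, and uniqueness profiles listed above can be approximated in plain Python to show what each one measures. This is only a conceptual sketch and not how Information Steward computes its results. The sample rows and function names are hypothetical, and address profiling is omitted because it depends on the address cleansing transforms.

```python
from collections import Counter, defaultdict
from statistics import median

# Hypothetical sample data; note the duplicated id 3.
rows = [
    {"id": 1, "state": "CA", "city": "Los Angeles", "amount": 10},
    {"id": 2, "state": "CA", "city": "Los Angeles", "amount": 30},
    {"id": 3, "state": "NY", "city": "New York", "amount": 20},
    {"id": 3, "state": "NY", "city": "Albany", "amount": 40},
]

def column_profile(rows, col):
    """Basic value characteristics of one column."""
    values = [r[col] for r in rows]
    return {"min": min(values), "max": max(values), "median": median(values)}

def uniqueness(rows, cols):
    """Count and percentage of rows whose key occurs more than once."""
    keys = [tuple(r[c] for c in cols) for r in rows]
    counts = Counter(keys)
    non_unique = sum(1 for k in keys if counts[k] > 1)
    return non_unique, 100.0 * non_unique / len(rows)

def dependency(rows, determinant, dependent):
    """For each determinant value, the set of dependent values seen with it."""
    seen = defaultdict(set)
    for r in rows:
        seen[r[determinant]].add(r[dependent])
    return dict(seen)

def redundancy(left_values, right_values):
    """Percentage of distinct left values that also appear on the right."""
    left, right = set(left_values), set(right_values)
    return 100.0 * len(left & right) / len(left)

profile = column_profile(rows, "amount")
uniq = uniqueness(rows, ["id"])
dep = dependency(rows, "state", "city")
red = redundancy([r["state"] for r in rows], ["NY", "TX"])
```

In this sample, the dependency profile shows that `NY` maps to two different cities, and the uniqueness profile flags the two rows that share `id` 3.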
To take advantage of the capabilities listed above, the respective license is required. From here on, we will focus on the capabilities included in the Data Integrator package.
SAP Data Services uses IDocs to post data into SAP S/4HANA; therefore, the respective configuration on the SAP S/4HANA target system is a common requirement for all migration objects. A custom program is delivered to create the required partner profiles for each required message type (refer to the building block “Data Migration IDoc Config Guide (W01)”).
Next, SAP Data Services and its prerequisites need to be deployed, together with the pre-delivered migration content. This is documented in “RDM_S4H_OP_DS42V2_Quick_Guide_EN_XX”, attached to Note 2239701. Note that there are two separate versions of this manual, one for Windows and one for Linux-based deployment, and the supported underlying repository database platforms vary between them. Also, the document may only detail set-up steps for one selected DB; for other supported DB platforms, refer to the standard SAP Data Services documentation.
Once standard set-up is complete, delivered content needs to be applied – from that point forward, the platform is ready for preparation and execution of data migration for selected objects.
When sourcing data from your legacy system, you can either use flat files (text or XLS) as an intermediary or connect directly to your source system or database.
With the documentation and content delivered via the referenced Note, you can deliver your data migration project yourself. However, SAP also offers a packaged service to deliver the described scope. You can get more details from https://rapid.sap.com/bp/#/browse/packageversions/RDM_S4H_OP > Accelerators > Customer presentation. The service has a flexible scope and an experienced delivery team. This can be particularly interesting when you do not operate SAP Data Services in your environment and do not have the necessary skills available.
Closing remarks and considerations
The two toolsets and methods for migrating your data described above are quite different when it comes to the tooling, capabilities and associated effort. The matrix below attempts to summarise and compare key aspects of each.
| |SAP S/4HANA Migration Cockpit (MC) and SAP S/4HANA Migration Object Modeler (MOM)|Rapid Data Migration with SAP Data Services (DS)|
|---|---|---|
|Technical deployment|Built into SAP S/4HANA 1610 and later.|Separate deployment and set-up necessary for SAP Data Services, BI Platform and optionally Information Steward.|
|Commercial aspects|Capability provided as part of the SAP S/4HANA license.|Core capability included in selected SAP HANA licenses. Advanced functionality (for data cleansing) requires the full SAP Data Services license.|
|Data extraction methods|Only file-based load supported at this stage.|File-based as well as direct load from source system/database.|
|Delivery method|Best practice documentation and built-in migration object templates delivered as part of the solution.|Best practice documentation and built-in migration object templates delivered as part of the solution. Also available as a packaged service from SAP.|
|Extensibility|Extensibility using MOM allows going beyond reliance on standard content in Migration Cockpit.|Full extensibility using standard SAP Data Services capabilities.|
|Data quality support|None other than data validation during (simulation) posting.|Yes, but requires the SAP Data Services license.|

Scope of supported migration objects: the scope is deliberately not compared in this blog, as it tends to change quite rapidly and depends (especially in the case of MC/MOM) on the target SAP S/4HANA version. It is fair to say that, up to a certain point, SAP Data Services has had a numerical advantage (number of supported objects), but this is rapidly changing. And with both toolsets supporting custom builds of migration objects and scenarios, the sky is the limit.
You can find more details on the internet: