This chapter describes how to work with Data Quality Products in Oracle Data Integrator.
This chapter includes the following sections:
- Section 15.1, 'Introduction to Oracle Data Quality Products'
- Section 15.2, 'The Data Quality Process'
15.1 Introduction to Oracle Data Quality Products
Oracle Data Profiling and Oracle Data Quality for Data Integrator (also referred to as Oracle Data Quality Products) extend the inline Data Quality features of Oracle Data Integrator to provide more advanced data governance capabilities.
A complete Data Quality system includes data profiling, integrity and quality:
Profiling makes possible data investigation and quality assessment. It allows business users to get a clear picture of their data quality challenges, to monitor and track the quality of their data over time. Profiling is handled by Oracle Data Profiling. It allows business users to assess the quality of their data through metrics, to discover or infer rules based on this data, and finally to monitor over time the evolution of the data quality.
Integrity control is essential in ensuring the overall consistency of the data in your information system's applications. Application data is not always valid for the constraints and declarative rules imposed by the information system. You may, for instance, find orders with no customer, or order lines with no product, and so forth. Oracle Data Integrator provides a built-in working environment to detect these constraint violations and store them for recycling or reporting purposes. Static and Flow checks in Oracle Data Integrator are integrity checks.
Quality includes integrity and extends to more complex quality processing. A rule-based engine applies data quality standards as part of an integration process to cleanse, standardize, enrich, match and de-duplicate any type of data, including names and addresses. Oracle Data Quality for Data Integrator places data quality as well as name and address cleansing at the heart of the enterprise integration strategy.
15.2 The Data Quality Process
The data quality process described in this section uses Oracle Data Quality products to profile and cleanse data extracted from systems using Oracle Data Integrator. The cleansed data is also re-integrated into the original system using Oracle Data Integrator.
The Quality Process has the following steps:
Create a Quality Input File from Oracle Data Integrator, containing the data to cleanse.
Create an Entity in Oracle Data Quality, based on this file.
Create a Profiling Project to determine quality issues.
Create an Oracle Data Quality Project to cleanse this Entity.
Export the Data Quality Project for run-time.
Reverse-engineer the Entities using the RKM Oracle Data Quality.
Run this Quality Project from Oracle Data Integrator using the OdiDataQuality tool.
Sequence the Process in a Package.
15.2.1 Create a Quality Input File
Oracle Data Quality uses as a source for the Quality project a flat file which contains the data to cleanse. This Quality input file can be created from Data Integrator and loaded from any source datastore using interfaces. This file should be a FILE datastore with the following parameters defined on the Files tab:
Parameter | Value
---|---
File Format | Delimited
Heading (Number of Lines) | 1
Record Separator | MS-DOS
Field Separator | Other
[Field Separator] Other | , (comma - hexadecimal 2C)
Text Delimiter | " (double quotation mark)
Decimal Separator | (empty, not specified)
For more information on creating a FILE datastore, refer to the Chapter 5, 'Creating and Reverse-Engineering a Model'. For more information on loading flat files, see 'Files' in the Oracle Fusion Middleware Connectivity and Knowledge Modules Guide for Oracle Data Integrator.
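As a sketch, a quality input file matching these datastore parameters (delimited, one header line, MS-DOS CR/LF record separator, comma field separator, double-quote text delimiter) can be produced with Python's csv module. The column names and sample rows below are illustrative only, not part of the product:

```python
import csv
import io

# Illustrative data; real input would be extracted by an ODI interface.
rows = [
    {"CUST_ID": "1001", "NAME": "ACME, Inc.", "CITY": "Denver"},
    {"CUST_ID": "1002", "NAME": "Globex", "CITY": "Springfield"},
]

buf = io.StringIO()
writer = csv.DictWriter(
    buf,
    fieldnames=["CUST_ID", "NAME", "CITY"],
    delimiter=",",          # Field Separator: Other -> comma (hex 2C)
    quotechar='"',          # Text Delimiter: double quotation mark
    quoting=csv.QUOTE_ALL,
    lineterminator="\r\n",  # Record Separator: MS-DOS
)
writer.writeheader()        # Heading (Number of Lines): 1
writer.writerows(rows)
print(buf.getvalue())
```

Quoting every field keeps embedded commas (such as "ACME, Inc.") from being read as field separators.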
15.2.2 Create an Entity
To import a data source into Oracle Data Quality for Data Integrator means to create an entity based on a delimited source file.
15.2.2.1 Step 1: Validate Loader Connections
Your administrator must set up at least one Loader Connection when he or she installs Oracle Data Quality for Data Integrator. This Loader Connection is used to access the Oracle Data Quality input file. As the input file is a delimited file, this Loader Connection should be a Delimited Loader Connection. In this step, you validate that this Delimited Loader Connection is set up correctly. Also verify that all the data and schema files you need are copied to the directory defined by the Loader Connection.
If you do not have access to the Metabase Manager, ask your Metabase administrator to verify the Loader Connection for you.
If you are a Metabase User and have access to the Metabase Manager, follow this procedure:
To validate a Loader Connection
Open the Metabase Manager (Start > All Programs > Oracle > Oracle Data Profiling and Quality > Metabase Manager).
Verify you are in Admin Mode.
Expand the Control Admin node.
Double-click Loader Connections.
On the right, the Loader Connections list view displays each Loader Connection, showing its name, type, data file, and parameters. Review the information to verify that the Loader Connection created by your administrator is a Delimited Loader Connection and that the data and schema directories are pointing to the correct location.
Note:
If you are a Metabase User with full Metabase privileges, you can create a new Loader Connection.
15.2.2.2 Step 2: Create Entity and Import Data
Use the Create Entity wizard to create an Entity. The Wizard takes you through each step, helps you to select data to load, and provides an interface for specifying connection and schema settings. It also gives you options for customizing how the data appears in an Entity.
To import a delimited source file into Oracle Data Quality for Data Integrator:
Copy the flat file that you want to import into Oracle Data Quality for Data Integrator into the data directory that you specified when you defined the Loader Connection.
Click on the Windows Start menu and select All Programs > Oracle > Oracle Data Profiling and Quality > Oracle Data Profiling and Quality.
Log in to the user interface with your Metabase user. The Oracle Data Profiling and Quality user interface opens.
From the Main menu, select Analysis > Create Entity…
The Create Entity wizard opens in the upper right pane.
On the Connection Page of the Create Entity wizard, select the Loader Connection given to you by the administrator and validated in Step 1.
Leave the default settings for the filter and the connection and click Next.
Oracle Data Quality connects to the data source using the Loader Connection you selected in Step 4. If the connection fails, contact your Metabase Administrator.
In the Entity Selection dialog, select the data source file name you want to import in the list and click Next.
Select the schema settings for the selected data file corresponding to the parameters of the file described in Section 15.2.1, 'Create a Quality Input File':
Delimiter: , (comma)
Quote: " (double quotation mark)
Attribute information: Names on first line
Select Records are CR/LF terminated.
Character encoding: ascii
For more information on configuring Entities for delimited files, see the Online Help for Oracle Data Profiling and Oracle Data Quality.
Note:
If the file is generated using Oracle Data Integrator, these file format parameters should correspond to the file format specified in the Files tab of the datastore definition.
After you select the schema settings, click Preview. The Preview mode shows how the data will appear in the Entity, based on your selected schema settings. The data displays below in a list view. Use the Preview mode to customize how the data will appear in the new Entity.
When you are ready to continue, click Close.
Click Next. The Load Parameters dialog opens. Specify the parameters as follows:
Select All Rows.
Leave the default Job name.
Click Next to continue.
In Confirm Settings, review the list of settings and click Finish to schedule the Entity creation job. The Schedule Job window opens.
Click Run Now.
15.2.2.3 Step 3: Verify Entity
During the data import process, Oracle Data Quality for Data Integrator translates your data files into three basic components (Metabase objects): Entities, Attributes, and Rows.
Perform the following verification tasks to ensure that the data you expected has been successfully imported into the Metabase and is correctly represented in the Metabase Explorer.
Make sure that for every data file imported you have one corresponding Entity.
Make sure that the column names do not contain any special characters with the exception of underscore (_) or minus sign (-) characters. Minus signs and underscores will be translated into spaces during the data load process.
Make sure that for every field imported you have one corresponding Attribute.
Make sure that you have one Entity Row for every data row imported.
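The column-name rule above can be sketched as a small check. The helpers below are hypothetical, not part of the product; they only mirror the stated rules (only underscores and minus signs allowed among special characters, both translated into spaces on load):

```python
import re

# Column names may contain alphanumerics, underscore (_) and minus (-).
NAME_OK = re.compile(r"^[A-Za-z0-9_-]+$")

def check_columns(names):
    """Return the column names that would violate the import rules."""
    return [n for n in names if not NAME_OK.match(n)]

def loaded_name(name):
    """How a name appears after load: _ and - become spaces."""
    return name.replace("_", " ").replace("-", " ")

print(check_columns(["CUST_ID", "FIRST-NAME", "E#MAIL"]))  # ['E#MAIL']
print(loaded_name("FIRST-NAME"))                           # FIRST NAME
```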
15.2.3 Create a Profiling Project
You can now run a Data Profiling Project with Oracle Data Profiling to find quality problems. Profiling discovers and analyzes the quality of your enterprise data. It analyzes data at the most detailed levels to identify data anomalies, broken filters and data rules, misaligned data relationships, and other concerns that allow data to undermine your business objectives.
For more information on Data Profiling see 'Working with Oracle Data Profiling' in the Online Help for Oracle Data Profiling and Oracle Data Quality.
15.2.4 Create an Oracle Data Quality Project
You can now create an Oracle Data Quality Project to validate and transform your data, and resolve data issues such as mismatching and redundancy.
Oracle Data Quality for Data Integrator is a powerful tool for repairing and correcting fields, values and records across multiple business contexts and applications, including data with country-specific origins. Oracle Data Quality for Data Integrator enables data processing for standardization, cleansing and enrichment, tuning capabilities for customization, and the ability to view your results in real-time.
A Quality Project cleanses input files and loads cleansed data into output files. At the end of your Oracle Data Quality project this input file may be split into several output files, depending on the data Quality project.
Important Note: A Data Quality project contains many temporary entities, some of which are not useful in the integration process. To limit the Entities reverse-engineered for use by Oracle Data Integrator, a filter based on entity names can be used. To use this filter efficiently, it is recommended that you rename the entities that you want to use in Oracle Data Integrator in a consistent way. For example, rename the input entities ODI_IN_XXX and the output (and no-match) files ODI_OUT_XXX, where XXX is the name of the entity.
For more information on Data Quality projects see 'Working with Oracle Data Quality' in the Online Help for Oracle Data Profiling and Oracle Data Quality.
15.2.5 Export the Data Quality Project
Oracle Data Integrator is able to run projects exported from Oracle Data Quality. Once the Data Quality project is complete, you need to export it for Oracle Data Integrator. The exported project contains the data files, Data Dictionary Language (DDL) files, settings files, output and statistics files, user-defined tables, and scripts for each process module in the project. An exported project can be run on UNIX or Windows platforms without the user interface, and only requires the Oracle Data Quality Server.
To create a batch script:
In the Explorer or Project Workflow, right-click the Oracle Data Quality project and select Export... > ODQ Batch Project > No data.
In Browse for Folder, select or make a folder where you want the project to be exported.
Click OK. A message window appears indicating that the files are being copied. This export process creates a folder named after the metabase (<metabase_name>) at the location that you specified. This folder contains a projectN sub-folder (where N is the project identifier in Oracle Data Quality). This project folder contains the following folders among others:
data: This folder contains input and output data as well as temporary data files. These files have a .DAT extension. As you specified No data for the export, this folder is empty.
ddl: This folder contains the entity metadata files (.DDX and .XML). These files describe the data files' fields. They are prefixed with eNN_, where NN is the Entity number. Each entity is described in two metadata files: eNN_<name of the entity>.ddx is the description of the entity with possible duplicated columns (suitable for fixed files); eNN_<name of the entity>_csv.ddx is the description of the entity with non-duplicated columns (suitable for fixed and delimited files). It is recommended to use these _csv.ddx files for the reverse-engineering process.
scripts: This folder contains the batch script runprojectN. This script runs the quality process and is the one that will be triggered by Oracle Data Integrator.
settings: This folder contains settings files (.ddt, .sto, .stt, .stx) and the configuration file config_batch.tbl.
After the message window has disappeared, examine the folder you have specified and check that all folders and files are correctly created.
Move the exported project to a folder on the run-time machine. This machine must have the Oracle Data Quality Server installed, as it will run the quality project.
Open with a text editor the batch script (runprojectN, in the /batch/scripts sub-folder) and the configuration file (config_batch.tbl, in the /batch/settings sub-folder) of your projectN folder. Perform the following changes to configure the run-time directory in the project:
In config_batch.tbl, specify the location (absolute path) of the directory containing the projectN folder for the DATABASE parameter.
In runprojectN, specify the location (absolute path) of the projectN directory for the TS_PROJECT parameter.
For example, if you have the config_batch.tbl and runproject2.* files located in C:\oracle\oracledq\metabase_data\metabase\oracledq\project2\batch, you should specify:
in settings\config_batch.tbl: DATABASE = C:\oracle\oracledq\metabase_data\metabase\oracledq\project2\batch
in scripts\runprojectN.*: set TS_PROJECT=C:\oracle\oracledq\metabase_data\metabase\oracledq\project2\batch
Save and close the config_batch.tbl file.
In runprojectN, uncomment the very last line of the file (remove the :: characters at the beginning of the last line).
Save and close the runprojectN file.
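If you export projects frequently, the two path edits above can be scripted. The sketch below is not an Oracle-provided utility; it assumes only the 'KEY = value' and 'set KEY=value' line shapes shown in the example, and the paths in it are illustrative:

```python
import re

def set_assignment(text, key, value):
    """Replace the value of a 'KEY = value' or 'set KEY=value' line,
    keeping everything up to and including the '=' intact."""
    pattern = re.compile(
        rf"^(\s*(?:set\s+)?{re.escape(key)}\s*=).*$",
        re.MULTILINE | re.IGNORECASE,
    )
    # A replacement function avoids backslash-escape issues with
    # Windows paths in the substituted value.
    return pattern.sub(lambda m: m.group(1) + " " + value, text)

config = "DATABASE = <placeholder>\nOTHER_PARAM = unchanged\n"
print(set_assignment(
    config,
    "DATABASE",
    r"C:\oracle\oracledq\metabase_data\metabase\oracledq\project2\batch",
))
```

The same helper works for the TS_PROJECT line in runprojectN, since the regular expression also accepts the leading `set` keyword.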
Oracle Data Integrator uses CSV formatted files (typically, comma-delimited with one header line) to provide the data quality project with input data, and expects output data to be in the same format.
In the /settings directory, open with an editor the settings file corresponding to the first process of your project. This file is typically named eN_transfmr_p1.stx (where N is the internal ID of the entity corresponding to the quality input file) if the first process is a transformer. Change the following input parameters in the settings file:
In DATA_FILE_NAME, specify the name and location (absolute path) of your quality input file at run-time.
In FILE_DELIMITER, specify the delimiter used in the quality input file.
In START_RECORD, specify the line number where data starts. For example, if there is a 1-line header, the value should be 2.
For example, if you have the customer_master.csv quality input file (comma-separated with one header line) located in C:/oracle/oracledq/metabase_data/metabase/oracledq/Data/, you would set DATA_FILE_NAME to the full path of this file, FILE_DELIMITER to the comma character, and START_RECORD to 2.
Save and close the settings file.
Also in the /settings directory, open the file that corresponds to the settings of the process generating the output (cleansed) data. Typically, for a cleansing project that finishes with a Data Reconstructor process, it is named eNN_datarec_pXX.stx. Change the corresponding value in this settings file to give the full path of the generated output file. Save and close the settings file.
If you have several data quality processes that generate useful output files (for example, one data reconstructor per country), repeat the two previous steps for each of these processes.
15.2.6 Reverse-engineer the Entities
In order to provide the Quality process with input data and use its output data in data integrator's integration processes, it is necessary to reverse-engineer these Entities. This operation is performed using a customized reverse-engineering method based on the Oracle Data Quality RKM. The RKM reads metadata from the .ddx
files located in the /ddl
folder of your data quality project.
To reverse-engineer the Entities of a data Quality project:
Import the RKM Oracle Data Quality into your Oracle Data Integrator project.
Insert a physical schema for the File technology in Topology Manager, specifying for both the Directory (Schema) and the Directory (Work Schema) the absolute path of your data folder, for example C:\oracle\oracledq\metabase_data\metabase\oracledq\projectN\data.
This directory must be accessible to the agent that will be used to run the transformations. Oracle Data Integrator will look in the schema for the source and target data structures for the interfaces. The RKM will access the output data files and reverse-engineer them.
Create a File model and reverse-engineer the /ddl folder:
In Designer Navigator, expand the Models panel.
Right-click then select New Model.
Enter the following fields in the Definition tab:
Name: Name of the model used in the user interface.
Technology: File
Logical Schema: Select the Logical Schema on which your model will be based.
In the Reverse tab, select:
Reverse: Customized
Context: Reverse-engineering Context
Type of objects to reverse-engineer: Table
KM: RKM Oracle Data Quality
Set the RKM options as shown in Table 15-1.
Table 15-1 KM Options for RKM Oracle Data Quality

Parameter | Default Value | Description
---|---|---
DDX_FILE_NAME | *.ddx | Mask for DDX files to process. If you have used a naming convention in the Quality project for the Entities that you want to use, enter a mask that will return only these Entities. For example, specify the ODI*_csv.ddx mask if you have used the ODI_IN_XXX and ODI_OUT_XXX naming convention for your input and output entities.
USE_FRIENDLY_NAMES | No | Set this option to Yes if you want the reverse-engineering process to generate user-friendly names for datastore columns based on the field name specified in the DDX file.
USE_LOG | Yes | Set to Yes if you want the reverse-engineering process activity to be logged in a log file.
LOG_FILE_NAME | /temp/reverse.log | Name of the log file.
Click Apply. The model is created, but contains no datastores yet.
Click Reverse. Now, the model contains datastores that you can see in the Models view.
15.2.7 Use Oracle Data Quality Input and Output Files in Interfaces
You can now create in Oracle Data Integrator interfaces sourcing or targeting the data Quality input and output files.
For example, you can:
Create interfaces to load the input file using datastores from various sources.
Create interfaces to re-integrate the output data back into the sources after cleansing.
15.2.8 Run this Quality Project from Oracle Data Integrator
The OdiDataQuality tool executes the batch file to run the Oracle Data Quality project. This tool takes as a parameter the path to the runprojectN script file. It can run either in synchronous (the tool waits for the quality process to complete) or asynchronous mode.
For more information about the OdiDataQuality tool and its parameters, see Section A.5.3, 'OdiDataQuality'.
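As an illustration only, a package step invoking the tool might look like the line below. The -BATCH_FILE_NAME and -SYNCHRONOUS parameter names are assumptions; take the exact names and syntax from Section A.5.3, 'OdiDataQuality'.

```
OdiDataQuality "-BATCH_FILE_NAME=<absolute path to the runprojectN script>" "-SYNCHRONOUS=yes"
```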
15.2.9 Sequence the Process in a Package
Create a package in Oracle Data Integrator sequencing the following process:
One or more Interfaces creating the Quality input file, containing the data to cleanse.
OdiDataQuality tool step launching the Oracle Data Quality process.
One or more Interfaces loading the data from the Oracle Data Quality output files into the target datastores.