Generic Graph And Psets

Index
- What are psets?
- Creation of psets.
- What is the analysis_level parameter?
- Achieving data lineage for generic graphs using psets.
- Physical datasets and logical EME datasets.
- Parameters to handle parallel running jobs calling the same graph.
- Capturing job statistics details in the EME when using generic graphs.

What are psets and how are they created

A pset is a set of input parameter and value pairs. You create one with the Input Values Editor in the Edit menu, which lets you specify a set of values for the graph's formal parameters and then save it as a separate .pset (parameter set) file in any of the directories under the private sandbox.

Steps:
a. Select Edit > Input Values ... from the GDE menu. The dialog looks the same as the graph parameter editor, with two columns: the parameter name and the value.
b. For each formal parameter, enter the required value in the value field.
c. Select File > Save As and save the value set as <graph name>.pset under the private sandbox's pset directory.

Note: The editor defaults to the project's mp directory as the location of the new .pset file, so you need to navigate to the pset directory in the sandbox.

Along with the existing formal parameters of the generic graph, define a formal parameter called analysis_level and set its value to none.

Check in the generic graph from the common sandbox to the EME.

Dependency analysis will not be performed on the generic graph itself, because analysis_level is set to none.

Each separate set of input values you create in this step represents a separate instance of the graph. To enable job tracking of the generic graph for these different value sets, simply check the .pset files into the EME datastore. The graph instance represented by each .pset file is analyzed and saved in the EME datastore as a graph object. For a .pset file to be analyzed, set the analysis_level parameter in that parameter set to expand. This was mandatory in Abinitio V-13. NOTE: Abinitio V-14 automatically expands the psets when they are checked in.

Achieving data lineage for generic graphs using psets

To achieve data lineage, each pset passes its own distinct logical EME dataset values to the same generic graph. When the psets are checked in, they are expanded and dependency analysis takes place. Each instance of the generic graph then shows up in the EME with its own logical dataset values.

EME view of distinct instances of the generic graph: the two instances of the same graph show different data lineage in the EME.

Physical dataset names override the logical EME dataset names passed from psets. The physical dataset names are set and then passed via the pset when the graph is executed from within the wrapper. For example, the wrapper first exports the physical dataset names and then calls the graph, passing the parameters.

Handling multiple instances of a graph running concurrently

AB_JOB_PREFIX: To avoid problems when multiple instances of a graph run concurrently in the same directory, you can make the AB_JOB value unique by exporting the AB_JOB_PREFIX configuration variable. AB_JOB_PREFIX should be assigned a dynamic value, for instance the process id (PID=$$). Alternatively, a date-timestamp in YYYYMMDDHHMISS format can be assigned to it. Setting this parameter ensures that AB_JOB resolves to ${AB_JOB_PREFIX}${AB_JOB}, so the recovery files are also created with different names.
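A minimal wrapper-script sketch of the AB_JOB_PREFIX setting described above. The trailing underscore separator and the commented-out graph script path are illustrative assumptions, not taken from the original deck.

```shell
#!/bin/sh
# Option 1: use the process id of the calling shell as the dynamic value.
PID=$$
AB_JOB_PREFIX="${PID}_"   # trailing "_" is an assumed separator, for readability only

# Option 2 (alternative): a timestamp in YYYYMMDDHHMISS form.
# AB_JOB_PREFIX="$(date +%Y%m%d%H%M%S)_"

# Export so the graph started from this wrapper inherits the value.
export AB_JOB_PREFIX
echo "AB_JOB_PREFIX=${AB_JOB_PREFIX}"

# The deployed graph script would be invoked here (hypothetical path):
# "${SANDBOX_DIR}/run/generic_graph.ksh"
```

A process id is only unique among processes running at the same moment on one host, while a second-resolution timestamp can collide if two jobs start within the same second. Combining both is one way to reduce either risk.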

Capturing job statistics details in the EME when using generic graphs

AB_AIR_JOB_GRAPH: Specifies the graph/application being run so that it can be linked to the job object.
- When a generic graph is called, the job statistics are stored in the EME under the name of the generic graph. This causes confusion and discrepancies when tracking statistics in the EME, because a generic graph may be used in multiple projects. The objective is to store job statistics under the pset name so that they can be correlated with the logical use of the generic graph.
- This parameter must be set in the calling script/program so that the generic graph's tracking is reposited against the pset version of the graph.
- If the graph is generic, set AB_AIR_JOB_GRAPH so that the job is associated with the pset instance of the graph, which performs the specific task according to the values passed through the pset.
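A hedged sketch of how a calling script might set AB_AIR_JOB_GRAPH before running the generic graph. The repository path, the pset name, and the exact form of the value are assumptions for illustration; the EME documentation for the installed version gives the expected format.

```shell
#!/bin/sh
# Hypothetical EME location of the checked-in pset instance (assumption).
PSET_EME_PATH="/Projects/sample_project/pset/generic_graph_customer.pset"

# Associate the job with the pset instance rather than the generic graph.
AB_AIR_JOB_GRAPH="${PSET_EME_PATH}"
export AB_AIR_JOB_GRAPH
echo "AB_AIR_JOB_GRAPH=${AB_AIR_JOB_GRAPH}"

# The generic graph would be started here from the wrapper (hypothetical path):
# "${SANDBOX_DIR}/run/generic_graph.ksh"
```

Setting the variable in the wrapper, not inside the generic graph, keeps the graph reusable: each caller supplies the pset it is running on behalf of.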

In Coop Sys 2.14 and above

Benefits
- Job statistics will be reposited with the logical use of the graph.
- The statistics will be accurately reported by the appropriate job group or project.
- Performance improvement in graph execution time.

For more detail, please read the following documents:
/opt/abinitio/abinitio-V2-15-5-0/doc/EME_Developer_Guide.pdf
/opt/abinitio/abinitio-V2-15-5-0/doc/EME_Reference.pdf
