Merge File Entries ICR (2365)

Name	Value
NUMBER	2365
IA #	2365
DATE CREATED	1998/04/27
CUSTODIAL PACKAGE	TOOLKIT
CUSTODIAL ISC	San Francisco
USAGE	Supported
TYPE	Routine
DBIC APPROVAL STATUS	APPROVED
ROUTINE	XDRMERG
NAME	Merge File Entries

GENERAL DESCRIPTION


Overview

A file in which entries need to be merged may be entered in the DUPLICATE
RESOLUTION file (file 15.1).  This requires adding the file as one which can
be selected as the variable pointer, and search criteria would usually need to
be specified to assist in identifying potential duplicate pairs (although an
option can be used by which selected pairs can be added directly to the
DUPLICATE RECORD file as verified duplicates).  Verified duplicate pairs may
be approved for merging, and a merge process generated for those approved
pairs.  A merge based on a DUPLICATE RECORD file entry will also handle files
which are not associated as normal pointers, provided they are identified in
the PACKAGE file under the 'AFFECTS RECORD MERGE' subfile with special
processing routines.

***IF A FILE HAS RELATED FILES WHICH ARE NOT NORMAL POINTERS, THEY SHOULD BE
HANDLED ONLY AS ENTRIES IN THE DUPLICATE RECORD FILE AND THE TOOLKIT OPTIONS
USED FOR MERGES INVOLVING THE FILE.***

The merge utility of Kernel Toolkit as revised by patch XT*7.3*23 provides an
entry point which is available to developers for the merging of one or more
pairs of records (a FROM record and a TO record) in a specified file.  The
merge process merges the data of the FROM record into that of the TO record
and deletes the FROM record, restoring by a hard set only the zero node with
the .01 value on it until the merge process is completed (such that any
references to that location via pointers will not error out).  Any files which
contain entries DINUMed with the data pairs are then also merged (and any
files which are related to them by DINUM as well).  Any pointers which can be
identified rapidly by cross-references are modified so that references for
the FROM entry become references to the TO entry instead.  Following this, any
files which contain other pointers are searched entry by entry to test for
pointers to a FROM entry, and when found are modified to reference the TO
entry.  This search for pointer values is the most time consuming part of the
entire process and may take an extended period depending upon the number of
files that must be searched, the number of entries in those files, and how
many levels of subfiles pointers may be located at.  Since the search through
these files will take the same period of time independent of the number of
pairs which are being merged, it is suggested that as many pairs as convenient
be combined in one process.  At the end of the conversion of these pointers,
the zero node stubs will be removed from the primary file and all related
DINUMed files.

The merge process is a single job which is tracked with frequent updates on
location and status from start to finish.  The job can be stopped at any time
if necessary using Task Manager utilities (or in the event of a system crash,
etc.) and restarted at the point of interruption at a later time.


The manner in which data is merged.

When a primary file's or a DINUMed file's entries are merged, any top level
(single value) fields which are present in the FROM entry which are not
present in the TO entry will be merged into the TO entry's data.  Any of these
fields which contain cross-references will be entered using a VA File Manager
utility (FILE^DIE) so that the cross-references will be fired.  Other fields
(those without cross-references) will be directly set into the data global.

If a subfile entry exists in the FROM record which is not present in the TO
record (as identified by the .01 value), that entry will be created with a VA
File Manager utility (UPDATE^DIE) and the rest of the subfile merged over into
the TO record and the cross-references within the subfile and any descendent
subfiles run.

If a subfile entry exists in the FROM record and an identical .01 value exists
in the TO record, the subfile in the FROM record will be searched for any
descendent subfiles which are not present in the TO record subfile.  If such a
subfile is found it will be merged into the subfile in the TO record and any
cross-references in the merged subfile run.

For fields which are simple pointers to the primary file (or any other file
DINUMed to the primary file) the reference to the FROM record will be changed
to a reference to the TO record.  If the field contains a cross-reference this
editing will be performed using a VA File Manager Utility call (FILE^DIE),
otherwise it will be set directly into the global node.

STATUS	Active
DURATION	Till Otherwise Agreed

XDRMERG

COMPONENT/ENTRY POINT

COMPONENT DESCRIPTION

VARIABLES

The entry point EN^XDRMERG provides for merging of
one or more pairs of records in a specified file.  This entry point takes two
(2) arguments, the file number (a numeric value) and a closed reference to the
location where the program will find an array with subscripts indicating the
record pairs to be merged (a text value).  There can be either two or four
subscripts in the data array, as described below.  For example, the command

D EN^XDRMERG(999000014,"MYLOC")

would result in the record pairs specified as subscripts in the array MYLOC
being merged in a hypothetical file 999000014.  The array MYLOC might have been
set up prior to this call in the following manner (or any equivalent way),
where the subscripts represent the internal entry numbers of the FROM and TO
records, respectively.

S MYLOC(147,286)="",MYLOC(182,347)="",MYLOC(2047,192)=""
S MYLOC(837,492)="",MYLOC(298,299)=""

This would result in five record pairs being merged with record 147 (the FROM
record) being merged into record 286 (the TO record), record 182 being merged
into record 347, etc., through record 298 being merged into record 299.  Merges
using the two subscript format will occur without a specific record of the
entries prior to the merge (the internal entry numbers merged would be recorded
under the file number in file 15.3).  An alternative is a four subscript format for the
data array which uses variable pointer formats for the FROM and TO records as
the third and fourth subscripts.  If the merge is performed with this four
subscript array, then a premerge image of the data of both the FROM and TO
records in the primary file and all other merged files (those related by
DINUM) and information on all single value pointer values modified is stored
in the MERGE IMAGE file (file 15.4). For the above example data [assuming that
the global root for the hypothetical file 999000014 is ^DIZ(999000014,] the
four subscript array might be generated using the following code

S MYROOT=";DIZ(999000014,"        ; note: the leading ^ is omitted
S MYLOC(147,286,147_MYROOT,286_MYROOT)=""
S MYLOC(182,347,182_MYROOT,347_MYROOT)=""
S MYLOC(2047,192,2047_MYROOT,192_MYROOT)=""
S MYLOC(837,492,837_MYROOT,492_MYROOT)=""
S MYLOC(298,299,298_MYROOT,299_MYROOT)=""
;
D EN^XDRMERG(999000014,"MYLOC")
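
When the pairs already exist in the two subscript format, the four subscript
form can be derived from them rather than entered by hand.  The following is a
minimal sketch only and is not part of the supported interface: the label
BUILD4 and the array name MYLOC4 are hypothetical, and the file number and
global root are those assumed for the hypothetical file 999000014 [global root
^DIZ(999000014,] described above.

BUILD4 ; hypothetical helper: build four subscript array MYLOC4 from MYLOC
 N FROM,TO,MYROOT
 K MYLOC4
 S MYROOT=";DIZ(999000014,"  ; assumed global root, leading ^ omitted
 S FROM=0
 F  S FROM=$O(MYLOC(FROM)) Q:FROM'>0  D
 . S TO=0
 . F  S TO=$O(MYLOC(FROM,TO)) Q:TO'>0  D
 . . ; third and fourth subscripts are the variable pointer forms
 . . S MYLOC4(FROM,TO,FROM_MYROOT,TO_MYROOT)=""
 D EN^XDRMERG(999000014,"MYLOC4")
 Q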



Exclusion of Multiple Pairs For a Record

To ensure that there are no unanticipated problems due to relationships
between a specific record in multiple merges, prior to actually merging any
data the various FROM and TO records included in the process are examined, and
if one record is involved in more than one merge, all except the first pair of
records involving that one are excluded from the merge.  If any pairs are
excluded for this reason, a mail message is generated to the individual
responsible for the merge process as indicated by the DUZ. If the following
entries were included in the MYLOC array

MYLOC(128,247)
MYLOC(128,536)  and
MYLOC(247,128)

only the first of these entries (based on the numeric sorting of the array)
would be permitted to remain in the merge process, while the other two pairs
would be omitted.  Although it may seem unlikely that someone would
indicate that a record should be merged into two different locations, while
another location should be merged into one that was merged away, if the pairs
are selected automatically and checks aren't included to prohibit such
behavior, such pairs will show up.  That is why the merge process won't
include more than one pair with a specific record in it.
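
A caller who prefers to find such pairs before the merge is started (rather
than learn of them from the mail message) could walk the array first.  The
following is a minimal sketch which only approximates the rule described
above; the actual logic used by the merge utility may differ in detail.  The
label CHKDUP and the local array SEEN are hypothetical, and the two subscript
MYLOC array is assumed.

CHKDUP ; hypothetical pre-check: report pairs which reuse a record
 N FROM,TO,SEEN
 S FROM=0
 F  S FROM=$O(MYLOC(FROM)) Q:FROM'>0  D
 . S TO=0
 . F  S TO=$O(MYLOC(FROM,TO)) Q:TO'>0  D
 . . ; a record already used in an earlier pair (as FROM or TO) flags this pair
 . . I $D(SEEN(FROM))!$D(SEEN(TO)) W !,"Pair ",FROM,",",TO," reuses a record" Q
 . . S SEEN(FROM)="",SEEN(TO)=""
 Q

With the three entries shown above, the pair 128,247 is accepted and the
pairs 128,536 and 247,128 are reported, since records 128 and 247 have already
been seen.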

Problems Related To Data Entry While Merging

The Merge Process has been designed to combine data associated with the two
records in the manner described above.  On occasion, however, there are
problems which cause VA File Manager to reject the data that is being entered.
This may happen for a number of reasons.  Some that have been observed were:
clinics which had been changed so they were no longer indicated as clinics (so
they wouldn't add to the number that people had to browse through to select a
clinic), but which were rejected since the input transform checked that they
be clinics; pointer values that no longer had a valid value in the pointed-to
file (dangling pointers); and fields that have input transforms that prohibit
data entry.

It is possible to use a validity checker on your data prior to initiating the
actual merge process (this is the action taken by merges working from the
Potential Duplicate file).  The data pairs are processed in a manner similar
to the actual merge, so only that data in any of the files which would be
merged and for which the data would be entered using VA File Manager utilities
for the specific pair are checked to ensure they will pass the input
transform.  Any problems noted are incorporated into a mail message for
resolution prior to attempting to merge the pair again, and the pair is
removed from the data array that was passed in.  Pairs which pass through this
checking should not encounter any data problems while being merged.

VARIABLES	TYPE	VARIABLES DESCRIPTION
FILE	Input	Specifies the FILE NUMBER of the file in which the indicated entries are to be merged.
ARRAYNAM	Input	This variable contains the name of the array as a closed root under which the subscripts indicating the FROM and TO entries will be found. The data may have either two or four subscripts descendent from the array which is passed in. Please see the overall description provided for examples of its usage.

RESTART

This entry point is used to restart a merge which has
been stopped.  The information necessary for restarting may be viewed using
the CHKLOCAL entry point in XDRMERG2 (see LOCAL MERGE STATUS).

VARIABLES	TYPE	VARIABLES DESCRIPTION
FILE	Input	Specifies the FILE NUMBER of the file in which the indicated entries are to be merged.
ARRAYNAM	Input	This variable contains the name of the array as a closed root under which the subscripts indicating the FROM and TO entries will be found. The data may have either two or four subscripts descendent from the array which is passed in. Please see the overall description provided for examples of its usage.
PHASE	Input	This variable indicates the phase of the merge process in which the merge should be restarted. The value is a number in the range of 1 to 3, with no decimal places. Phase 1 is usually quite short and is the merge of the specified entries in the primary file. Phase 2 is the merging of entries in files which are DINUMed to the primary file and changing pointers which can be identified from cross-references. Phase 3 is finding pointer values by searching each entry in a file. This will usually be the longest phase of the merge process.
CURRFILE	Input	This is the current file NUMBER on which the merge process is operating.
CURRIEN	Input	This is the current internal entry number in the file on which the merge process is operating.
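
As an illustration only, a restart of the hypothetical merge used in the
earlier examples might look like the following.  This sketch assumes that the
arguments are passed positionally in the order in which they are listed in the
table above, which this description does not state explicitly and which should
be confirmed against the routine before use.  The phase, current file number,
and current internal entry number shown are hypothetical values of the kind
that would be taken from the merge status information.

 ; assumed order: FILE,ARRAYNAM,PHASE,CURRFILE,CURRIEN
 ; restart in phase 3 at hypothetical entry 1500 of hypothetical file 999000020
 D RESTART^XDRMERG(999000014,"MYLOC",3,999000020,1500)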