VerityDotNet 1.0
C# library for Verity data profiling, quality control, remediation
Classes
class | LookUpDict
Dictionary with keys (1, 2, or 3 field values) mapped to a replacement value. Keys can use wildcards and special notations for AND and NOT conditions.
class | LookUpRec
Record within a lookup dictionary.
Static Public Member Functions
static LookUpDict | MakeLookUpFromFile (string title, string fileURI, string delim, bool isCaseSens, int numKeys)
Builds a LookUpDict from a text file.
static LookUpDict | MakeLookUpFromList (string title, List< string > lkupList, string delim, bool isCaseSens, int numKeys)
Builds a LookUpDict from a list of strings, similar to reading from a text file.
static LookUpDict | ExtractLookUpRecordKeyInfo (LookUpDict lookUpDict)
Extracts the lookup key AND and NOT conditions for each record and each key.
Lookup dictionary object for transform processing. Transforms have an operation (Op) that assigns a value by looking up the current value in a dictionary. One, two, or three keys are allowed, with the replacement value taken from the column that follows the keys. This Op is in transform_types under category=assignment, function=lookup. The description of this function and how it uses the lookup is:
Assigns a value from a lookup list based on matching values to keys, where keys can use wildcards. The match can be made against one, two, or three source fields in the record: field 1 is always the current value, while fields 2 and 3 are optional and are used only if set in Param2. Leave Param2 empty to use only one field as the match value. All selected fields must match their respective conditions for a lookup result to be assigned.
The conditions and the result to assign are held in an object for each list entry with the properties key1, key2, key3, and value, for example {'key1':'top*','key2':'blue','key3':'left','value':'Orange'}. Conditions can use a front and/or back wildcard (*), such as top*, *night, or *state, to allow token matching. To use multiple conditions for the same replacement value (an OR condition), enter them as separate list entries. To use AND and NOT conditions, use the special notations -and- and -not- as delimiters: top*-and-*night-not-*blue* means a match requires both top* and *night to be true, with no instances of the token blue.
Param1: title of the list, which has been pre-loaded into the array of lists as part of initialization.
Param2: fields 2 and 3, both optional; if both are supplied, delimit them with a pipe, as in color|position.
For the example entry above, the current value must start with top (key1 condition), the field color must contain blue, and the field position must end with left. All of these must be true for a match, in which case the value Orange is assigned.
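The key notation described above can be illustrated with a minimal C# sketch. It is not VerityDotNet's implementation: the class LookupKeySketch and its methods are hypothetical names, and it assumes that every pattern after a -not- delimiter is a forbidden pattern and that an empty pattern or a lone * matches anything.

```csharp
using System;

// Hypothetical helper, not part of VerityDotNet: evaluates one lookup key
// such as "top*-and-*night-not-*blue*" against a single field value.
public static class LookupKeySketch
{
    // Matches one simple pattern with an optional front and/or back '*'.
    public static bool MatchesPattern(string value, string pattern, bool isCaseSens = false)
    {
        StringComparison cmp = isCaseSens
            ? StringComparison.Ordinal
            : StringComparison.OrdinalIgnoreCase;
        bool front = pattern.StartsWith("*", StringComparison.Ordinal);
        bool back = pattern.Length > 1 && pattern.EndsWith("*", StringComparison.Ordinal);
        string core = pattern.Trim('*');

        if (core.Length == 0) return true;               // assumption: "" or "*" matches anything
        if (front && back) return value.IndexOf(core, cmp) >= 0;  // *text*  -> contains
        if (front) return value.EndsWith(core, cmp);               // *text   -> ends with
        if (back) return value.StartsWith(core, cmp);              // text*   -> starts with
        return value.Equals(core, cmp);                            // text    -> exact match
    }

    // Evaluates a full key: all -and- patterns must match, no -not- pattern may match.
    public static bool MatchesKey(string value, string key, bool isCaseSens = false)
    {
        string[] notParts = key.Split(new[] { "-not-" }, StringSplitOptions.None);

        // The first segment holds the required patterns, joined by -and-.
        foreach (string required in notParts[0].Split(new[] { "-and-" }, StringSplitOptions.None))
        {
            if (!MatchesPattern(value, required, isCaseSens)) return false;
        }

        // Assumption: each later segment is one forbidden pattern.
        for (int i = 1; i < notParts.Length; i++)
        {
            if (MatchesPattern(value, notParts[i], isCaseSens)) return false;
        }
        return true;
    }
}
```

With this sketch, LookupKeySketch.MatchesKey("top of the night", "top*-and-*night-not-*blue*") evaluates to true, while the same key rejects "top blue night" because the NOT pattern matches.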
static LookUpDict ExtractLookUpRecordKeyInfo (LookUpDict lookUpDict)
Extracts the lookup key AND and NOT conditions for each record and each key.
lookUpDict | original LookUpDict
static LookUpDict MakeLookUpFromFile (string title, string fileURI, string delim, bool isCaseSens, int numKeys)
Builds a LookUpDict from a text file.
The file is read and columns are extracted by splitting each line with the delimiter named by delim (comma, pipe, tab, colon). Empty lines and comment lines (beginning with # or //) are ignored. The first line containing data must hold the delimited field names. numKeys specifies how many columns are used as keys (1-3); the next column is used as the value. At least this number of columns must be present after splitting. If a line read from the file contains a double quote, a more precise (but slower) column separation is used.
title | LookUpDict title, used as the name in a Transform to specify which dictionary to use
fileURI | OS URI of the file to open and read
delim | name of the delimiter used to parse fields (comma, pipe, tab, colon)
isCaseSens | bool indicating whether names and values are case sensitive (default false)
numKeys | number of keys used for match conditions (1-3)
static LookUpDict MakeLookUpFromList (string title, List< string > lkupList, string delim, bool isCaseSens, int numKeys)
Builds a LookUpDict from a list of strings, similar to reading from a text file.
The list contains what would otherwise be read from a file. Each entry is split with the delimiter named by delim (comma, pipe, tab, colon). Empty entries and comment entries (beginning with # or //) are ignored. The first entry containing data must hold the delimited field names. numKeys specifies how many columns are used as keys (1-3); the next column is used as the value. At least this number of columns must be present after splitting. If an entry contains a double quote, a more precise (but slower) column separation is used.
title | LookUpDict title, used as the name in a Transform to specify which dictionary to use
lkupList | list of strings corresponding to lines read from a file
delim | name of the delimiter used to parse fields (comma, pipe, tab, colon)
isCaseSens | bool indicating whether names and values are case sensitive (default false)
numKeys | number of keys used for match conditions (1-3)
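The line handling described for MakeLookUpFromFile and MakeLookUpFromList can be sketched as follows. This is only an illustration under stated assumptions, not the library's code: LookupParseSketch, Rec, DelimChar, and Parse are hypothetical names, column values are trimmed (an assumption), and the slower quote-aware column separation is not reproduced.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not part of VerityDotNet: turns lookup lines into records.
public static class LookupParseSketch
{
    // Hypothetical record shape: 1-3 keys plus the replacement value.
    public sealed class Rec
    {
        public string[] Keys = Array.Empty<string>();
        public string Value = "";
    }

    // Maps the delimiter names listed in the documentation to characters.
    public static char DelimChar(string delim)
    {
        switch (delim.ToLowerInvariant())
        {
            case "comma": return ',';
            case "pipe": return '|';
            case "tab": return '\t';
            case "colon": return ':';
            default: throw new ArgumentException("Unknown delimiter name: " + delim);
        }
    }

    public static List<Rec> Parse(IEnumerable<string> lines, string delim, int numKeys)
    {
        if (numKeys < 1 || numKeys > 3) throw new ArgumentOutOfRangeException(nameof(numKeys));
        char sep = DelimChar(delim);
        var recs = new List<Rec>();
        bool headerSeen = false;

        foreach (string raw in lines)
        {
            // Ignore empty lines and comment lines (# or //).
            if (string.IsNullOrWhiteSpace(raw)) continue;
            string lead = raw.TrimStart();
            if (lead.StartsWith("#", StringComparison.Ordinal) ||
                lead.StartsWith("//", StringComparison.Ordinal)) continue;

            // The first data-containing line holds the field names.
            if (!headerSeen) { headerSeen = true; continue; }

            string[] cols = raw.Split(sep);
            if (cols.Length < numKeys + 1)
                throw new FormatException("Expected at least " + (numKeys + 1) + " columns: " + raw);

            // The first numKeys columns are keys; the next column is the value.
            var rec = new Rec { Keys = new string[numKeys], Value = cols[numKeys].Trim() };
            for (int i = 0; i < numKeys; i++) rec.Keys[i] = cols[i].Trim();
            recs.Add(rec);
        }
        return recs;
    }
}
```

Passing the lines "key1,key2,value" and "top*,blue,Orange" with delim "comma" and numKeys 2 yields a single record with keys top* and blue and the value Orange.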