VerityDotNet 1.0
C# library for Verity data profiling, quality control, remediation
VerityDotNet.RecFuncs Class Reference

Helper functions to split quoted data records, handle parameter aliases, detect datatypes, and more. Character aliases (char_aliases) are tokens that stand in for disruptive characters. For example, -lsquare- is the [ character, which has special meaning in code.

Static Public Member Functions

static Dictionary< string, string > GetCharAliases ()
 Retrieves a dictionary with key = character alias and value = character, such as: { "-comma-", "," }, { "-tab-", "\t" }, { "-pipe-", "|" }, { "-space-", " " }, { "-bslash-", "\\" }, { "-fslash-", "/" }, { "-lparen-", "(" }, { "-rparen-", ")" }, { "-lcurly-", "{" }, { "-rcurly-", "}" }, { "-lsquare-", "[" }, { "-rsquare-", "]" }, { "-dblquote-", "\"" }, { "-mathpi-", Math.PI.ToString() }, { "-mathe-", Math.E.ToString() }.
 
static Dictionary< string, string > GetCharAliasesReverse ()
 Retrieves a dictionary with key = character and value = character alias, such as: { ",", "-comma-" }, { "\t", "-tab-" }, { "|", "-pipe-" }, { "\\", "-bslash-" }, { "/", "-fslash-" }, { "(", "-lparen-" }, { ")", "-rparen-" }, { "{", "-lcurly-" }, { "}", "-rcurly-" }, { "[", "-lsquare-" }, { "]", "-rsquare-" }, { "\"", "-dblquote-" }.
 
static string ConvertCharAliases (string strin)
 Finds and converts character aliases in a string.
 
static string ExtractCharAliases (string strin, Dictionary< string, bool > ignore)
 Finds and converts troublesome characters into aliases.
 
static string ExtractCharAliases (string strin)
 Finds and converts troublesome characters into aliases.
 
static bool IsMathAlias (string valnum)
 Checks if string is math alias of pi or e (-mathpi- or -mathe-)
 
static string GetMathAlias (string valnum)
 Converts the math aliases for pi and e into their full real-number values as provided by .NET.
 
static string DelimGetChar (string valnum)
 Converts name of delimiter into its character.
 
static string ConvertSpecialNotation (string valnum)
 Converts the VerityX product special notations into their mapped strings.
 
static List< string > SplitQuotedLine (string linein, string delim)
 Splits an input text line on a delimiter while honoring quoted fields; not every field needs to be quoted.
 
static string DetectDataType (string strin)
 Attempts to determine a string value's datatype using multiple characteristics.
 
static string Do_Pad (string valin, string padChar, string padSide, int nlength)
 Pads string either on front or back side.
 
static string AssignDataType (Dictionary< string, long > datatypeDist, Dictionary< string, string > settings)
 Uses the distribution of detected datatypes for a field to determine the most likely datatype to assign to it. This uses threshold settings and knowledge from curated data sets across multiple domains and data systems.
 
static List< string > AssignDataTypesToFields (List< Dictionary< string, long > > dataypeDistFields, Dictionary< string, string > settings)
 Uses a list of per-field distributions of detected datatypes to determine the most likely datatype to assign to each field. This uses threshold settings and knowledge from curated data sets across multiple domains and data systems.
 
static bool IsFieldItsDataType (string dType, string fieldVal)
 Determines if a field's value is in its specified datatype.
 
static bool IsFieldItsDataType (string dType, string fieldVal, string dateFmt)
 Determines if a field's value is in its specified datatype.
 
static string IsFieldItsFormat (string fieldVal, Field field, bool allowEmpty=false)
 Determines if field value conforms to its defined format (if set)
 

Detailed Description

Helper functions to split quoted data records, handle parameter aliases, detect datatypes, and more. Character aliases (char_aliases) are tokens that stand in for disruptive characters. For example, -lsquare- is the [ character, which has special meaning in code.

Member Function Documentation

◆ AssignDataType()

static string VerityDotNet.RecFuncs.AssignDataType ( Dictionary< string, long > datatypeDist, Dictionary< string, string > settings )

Uses the distribution of detected datatypes for a field to determine the most likely datatype to assign to it. This uses threshold settings and knowledge from curated data sets across multiple domains and data systems.

Parameters
datatypeDist: dictionary with keys [string, int, real, date, bool, empty], where each value is the number of instances. This should come from the results of AnalyzeQuality.Inspect()
settings: dictionary with keys for various settings, including
  • include_empty: bool, whether to include the number of empty values in the statistical calculation. Default is True
  • minfrac: real number, minimum threshold given either as a percentage (any value greater than 1) or as a fraction (0-1). Default is 0.75
Returns
string with the datatype (string, int, real, date, bool), or empty if it cannot be determined. Starts with notok: if an error occurs
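The threshold behaviour described above can be pictured with a small standalone sketch. It is not the library's implementation: the class AssignSketch, the method Assign, and the rule "pick the most frequent non-empty datatype and accept it only if its share reaches minfrac" are assumptions made for illustration. The real function also draws on curated knowledge that is not reproduced here.

```csharp
using System.Collections.Generic;

// Hypothetical sketch, not part of VerityDotNet.
public static class AssignSketch
{
    // dist uses the documented keys: string, int, real, date, bool, empty.
    public static string Assign(Dictionary<string, long> dist,
                                bool includeEmpty = true,
                                double minFrac = 0.75)
    {
        // Values greater than 1 are read as percentages, as the minfrac setting describes.
        if (minFrac > 1) minFrac /= 100.0;

        long total = 0;
        long bestCount = 0;
        string best = "";
        foreach (KeyValuePair<string, long> pair in dist)
        {
            bool isEmpty = pair.Key == "empty";
            // Empty values only count toward the denominator when includeEmpty is set.
            if (!isEmpty || includeEmpty) total += pair.Value;
            // Assumption: "empty" is never a candidate datatype.
            if (!isEmpty && pair.Value > bestCount)
            {
                best = pair.Key;
                bestCount = pair.Value;
            }
        }

        if (total == 0) return "";
        return (double)bestCount / total >= minFrac ? best : "";
    }
}
```

With the counts int = 80, string = 10, empty = 10, the defaults accept int (80 of 100 is 0.80), while a minfrac of 90 leaves the datatype undetermined.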

◆ AssignDataTypesToFields()

static List< string > VerityDotNet.RecFuncs.AssignDataTypesToFields ( List< Dictionary< string, long > > dataypeDistFields, Dictionary< string, string > settings )

Uses a list of per-field distributions of detected datatypes to determine the most likely datatype to assign to each field. This uses threshold settings and knowledge from curated data sets across multiple domains and data systems.

Parameters
dataypeDistFields: list of dictionaries, one per field, with keys [string, int, real, date, bool, empty], where each value is the number of instances. This should come from the results of AnalyzeQuality.Inspect()
settings: dictionary with keys for various settings, including
  • include_empty: bool, whether to include the number of empty values in the statistical calculation. Default is True
  • minfrac: real number, minimum threshold given either as a percentage (any value greater than 1) or as a fraction (0-1). Default is 0.75
Returns
string list with one datatype per field (string, int, real, date, bool), or empty where it cannot be determined. The 0th entry starts with notok: if an error occurs

◆ ConvertCharAliases()

static string VerityDotNet.RecFuncs.ConvertCharAliases ( string strin )

Finds and converts character aliases in a string.

Parameters
strin: original string
Returns
New string with decoded aliases. On error, starts with notok:
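As a rough picture of what decoding aliases means, the sketch below replaces each alias token found anywhere in a string with its character. It is a standalone illustration: the class AliasDecodeSketch and its Decode method are hypothetical names, the table is a subset of the pairs listed for GetCharAliases(), and the null check is an assumption about when a notok: result might be produced.

```csharp
using System.Collections.Generic;

// Hypothetical sketch, not part of VerityDotNet.
public static class AliasDecodeSketch
{
    // Subset of the alias-to-character pairs documented for GetCharAliases().
    private static readonly Dictionary<string, string> Aliases = new Dictionary<string, string>
    {
        { "-comma-", "," }, { "-tab-", "\t" }, { "-pipe-", "|" }, { "-space-", " " },
        { "-bslash-", "\\" }, { "-fslash-", "/" }, { "-lparen-", "(" }, { "-rparen-", ")" },
        { "-lcurly-", "{" }, { "-rcurly-", "}" }, { "-lsquare-", "[" }, { "-rsquare-", "]" },
        { "-dblquote-", "\"" }
    };

    public static string Decode(string input)
    {
        // Assumption: a null input is reported with the notok: prefix.
        if (input == null) return "notok:input is null";

        string result = input;
        foreach (KeyValuePair<string, string> pair in Aliases)
        {
            // Replace every occurrence of the alias token with its character.
            result = result.Replace(pair.Key, pair.Value);
        }
        return result;
    }
}
```

For example, AliasDecodeSketch.Decode("x-lsquare-0-rsquare-") yields x[0], and a string without aliases comes back unchanged.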

◆ ConvertSpecialNotation()

static string VerityDotNet.RecFuncs.ConvertSpecialNotation ( string valnum )

Converts the VerityX product special notations into their mapped strings.

Notations:
-comma-    ->  ,
-tab-      ->  \t
-space-    ->  (a single space)
-pipe-     ->  |
-bslash-   ->  \\
-fslash-   ->  /
-lparen-   ->  (
-rparen-   ->  )
-lcurly-   ->  {
-rcurly-   ->  }
-lsquare-  ->  [
-rsquare-  ->  ]
-mathpi-   ->  math.pi value
-mathe-    ->  math.e value
-crlf-     ->  \r\n
-lf-       ->  \n
Parameters
valnum: string to check
Returns
Decoded string, or the original value if no notation matches. Starts with notok: on error

◆ DelimGetChar()

static string VerityDotNet.RecFuncs.DelimGetChar ( string valnum )

Converts name of delimiter into its character.

Parameters
valnum: name or character of the delimiter: comma, pipe, tab, colon, caret, hyphen
Returns
string containing the character if matched, otherwise empty. On error, starts with notok:

◆ DetectDataType()

static string VerityDotNet.RecFuncs.DetectDataType ( string strin )

Attempts to determine a string value's datatype using multiple characteristics.

Parameters
strin: original string
Returns
string naming the datatype. On error, starts with notok:
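The documentation does not say which characteristics are inspected, so the following is only a sketch of one common approach: try progressively looser parsers and fall back to string. The class DetectSketch, the check order, the single yyyy-MM-dd date format, and the "empty" result for blank input are all assumptions rather than the library's behaviour.

```csharp
using System;
using System.Globalization;

// Hypothetical sketch, not part of VerityDotNet.
public static class DetectSketch
{
    public static string Detect(string value)
    {
        // Assumption: blank input is reported as "empty".
        if (string.IsNullOrWhiteSpace(value)) return "empty";

        string v = value.Trim();
        CultureInfo inv = CultureInfo.InvariantCulture;

        // Narrowest types first, so "42" is int rather than real.
        if (bool.TryParse(v, out _)) return "bool";
        if (long.TryParse(v, NumberStyles.Integer, inv, out _)) return "int";
        if (double.TryParse(v, NumberStyles.Float, inv, out _)) return "real";

        // Assumption: only ISO-style dates are recognized in this sketch.
        if (DateTime.TryParseExact(v, "yyyy-MM-dd", inv, DateTimeStyles.None, out _)) return "date";

        return "string";
    }
}
```

The ordering matters: every integer also parses as a real number, so the integer check has to run first.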

◆ Do_Pad()

static string VerityDotNet.RecFuncs.Do_Pad ( string valin, string padChar, string padSide, int nlength )

Pads a string on either the front or the back.

Parameters
valin: original string
padChar: character to pad with. Some characters can be given by name (space, fslash, bslash, dollar, asterisk, tab, hyphen). If not supplied, the default is x. The pad character itself is at most 1 character long
padSide: front or back
nlength: final total length
Returns
padded string. On error, starts with notok:
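The padding rules above map closely onto the standard String.PadLeft and String.PadRight methods. The sketch below shows that mapping under stated assumptions: PadSketch and Pad are hypothetical names, the name table covers only the names listed for padChar, and the specific notok: messages are invented for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical sketch, not part of VerityDotNet.
public static class PadSketch
{
    // Named pad characters listed in the padChar parameter description.
    private static readonly Dictionary<string, string> Names = new Dictionary<string, string>
    {
        { "space", " " }, { "fslash", "/" }, { "bslash", "\\" }, { "dollar", "$" },
        { "asterisk", "*" }, { "tab", "\t" }, { "hyphen", "-" }
    };

    public static string Pad(string value, string padChar, string padSide, int length)
    {
        if (value == null) return "notok:value is null";
        if (length < 0) return "notok:length must not be negative";

        // Default pad character is x when none is supplied.
        string p = string.IsNullOrEmpty(padChar) ? "x" : padChar;
        if (Names.TryGetValue(p, out string named)) p = named;
        if (p.Length != 1) return "notok:pad character must be one character";

        if (padSide == "front") return value.PadLeft(length, p[0]);
        if (padSide == "back") return value.PadRight(length, p[0]);
        return "notok:padSide must be front or back";
    }
}
```

In this sketch a value already longer than the requested length is returned unchanged, because that is what PadLeft and PadRight do; the documentation does not state how Do_Pad treats that case.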

◆ ExtractCharAliases() [1/2]

static string VerityDotNet.RecFuncs.ExtractCharAliases ( string strin )

Finds and converts troublesome characters into aliases.

Parameters
strin: original string
Returns
New string with encoded aliases. On error, starts with notok:

◆ ExtractCharAliases() [2/2]

static string VerityDotNet.RecFuncs.ExtractCharAliases ( string strin, Dictionary< string, bool > ignore )

Finds and converts troublesome characters into aliases.

Parameters
strin: original string
ignore: dictionary whose keys are the characters to ignore and therefore not extract, such as , or [
Returns
New string with encoded aliases. On error, starts with notok:
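The effect of the ignore dictionary is easiest to see side by side. The standalone sketch below encodes characters one at a time using the pairs listed for GetCharAliasesReverse(). AliasEncodeSketch and Encode are hypothetical names, and treating an entry as ignored only when its value is true is an assumption, since the documentation does not say how the bool is used.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch, not part of VerityDotNet.
public static class AliasEncodeSketch
{
    // Character-to-alias pairs documented for GetCharAliasesReverse().
    private static readonly Dictionary<string, string> Reverse = new Dictionary<string, string>
    {
        { ",", "-comma-" }, { "\t", "-tab-" }, { "|", "-pipe-" }, { "\\", "-bslash-" },
        { "/", "-fslash-" }, { "(", "-lparen-" }, { ")", "-rparen-" }, { "{", "-lcurly-" },
        { "}", "-rcurly-" }, { "[", "-lsquare-" }, { "]", "-rsquare-" }, { "\"", "-dblquote-" }
    };

    public static string Encode(string input, Dictionary<string, bool> ignore)
    {
        if (input == null) return "notok:input is null";

        var sb = new StringBuilder();
        foreach (char c in input)
        {
            string key = c.ToString();
            // Assumption: a character is skipped only when its ignore entry is true.
            bool skip = ignore != null && ignore.TryGetValue(key, out bool flag) && flag;
            if (!skip && Reverse.TryGetValue(key, out string alias))
                sb.Append(alias);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Encoding a,b[1] with no ignore dictionary gives a-comma-b-lsquare-1-rsquare-, while passing an ignore dictionary containing the comma leaves the comma in place and still encodes the brackets.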

◆ GetCharAliases()

static Dictionary< string, string > VerityDotNet.RecFuncs.GetCharAliases ( )

Retrieves a dictionary with key = character alias and value = character, such as: { "-comma-", "," }, { "-tab-", "\t" }, { "-pipe-", "|" }, { "-space-", " " }, { "-bslash-", "\\" }, { "-fslash-", "/" }, { "-lparen-", "(" }, { "-rparen-", ")" }, { "-lcurly-", "{" }, { "-rcurly-", "}" }, { "-lsquare-", "[" }, { "-rsquare-", "]" }, { "-dblquote-", "\"" }, { "-mathpi-", Math.PI.ToString() }, { "-mathe-", Math.E.ToString() }.

Returns
Dictionary< string, string >

◆ GetCharAliasesReverse()

static Dictionary< string, string > VerityDotNet.RecFuncs.GetCharAliasesReverse ( )

Retrieves a dictionary with key = character and value = character alias, such as: { ",", "-comma-" }, { "\t", "-tab-" }, { "|", "-pipe-" }, { "\\", "-bslash-" }, { "/", "-fslash-" }, { "(", "-lparen-" }, { ")", "-rparen-" }, { "{", "-lcurly-" }, { "}", "-rcurly-" }, { "[", "-lsquare-" }, { "]", "-rsquare-" }, { "\"", "-dblquote-" }.

Returns
Dictionary< string, string >

◆ GetMathAlias()

static string VerityDotNet.RecFuncs.GetMathAlias ( string valnum )

Converts the math aliases for pi and e into their full real-number values as provided by .NET.

Parameters
valnum: string to check for the value -mathpi- or -mathe-
Returns
string holding pi or e, or the original string if it is not a math alias. On error, starts with notok:

◆ IsFieldItsDataType() [1/2]

static bool VerityDotNet.RecFuncs.IsFieldItsDataType ( string dType, string fieldVal )

Determines if a field's value is in its specified datatype.

Parameters
dType: field's defined datatype (int, real, bool, date, string)
fieldVal: field value
Returns
bool

◆ IsFieldItsDataType() [2/2]

static bool VerityDotNet.RecFuncs.IsFieldItsDataType ( string dType, string fieldVal, string dateFmt )

Determines if a field's value is in its specified datatype.

Parameters
dType: field's defined datatype (int, real, bool, date, string)
fieldVal: field value
dateFmt: date format to use when checking for a date
Returns
bool

◆ IsFieldItsFormat()

static string VerityDotNet.RecFuncs.IsFieldItsFormat ( string fieldVal, Field field, bool allowEmpty = false )

Determines if a field value conforms to its defined format (if set).

Parameters
fieldVal: field value to check
field: Field object
allowEmpty: bool, whether empty values (e.g. null) are allowed
Returns
string in the form bool:message, where bool is true or false and message is the reason. On error, starts with notok:message
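Because the result packs a flag and a reason into one string, callers need to split it before use. The sketch below shows one way to do that. FormatResultSketch and Parse are hypothetical names, and the sample messages in the usage note are invented; only the bool:message and notok:message shapes come from the description above.

```csharp
using System;

// Hypothetical sketch, not part of VerityDotNet.
public static class FormatResultSketch
{
    // Splits "true:reason", "false:reason", or "notok:reason" into its parts.
    public static (bool Ok, bool Error, string Message) Parse(string result)
    {
        if (result == null) return (false, true, "result is null");

        // Only the first colon separates the flag from the message.
        int idx = result.IndexOf(':');
        string head = idx < 0 ? result : result.Substring(0, idx);
        string message = idx < 0 ? "" : result.Substring(idx + 1);

        if (head == "notok") return (false, true, message);
        bool ok = head.Equals("true", StringComparison.OrdinalIgnoreCase);
        return (ok, false, message);
    }
}
```

For instance, Parse("false:too long") reports Ok = false, Error = false, and the message too long, whereas any result beginning with notok: sets Error so callers can distinguish a failed check from a failed call.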

◆ IsMathAlias()

static bool VerityDotNet.RecFuncs.IsMathAlias ( string valnum )

Checks if a string is the math alias of pi or e (-mathpi- or -mathe-).

Parameters
valnum: string to check for the value -mathpi- or -mathe-
Returns
bool

◆ SplitQuotedLine()

static List< string > VerityDotNet.RecFuncs.SplitQuotedLine ( string linein, string delim )

Splits an input text line on a delimiter while honoring quoted fields; not every field needs to be quoted.

Parameters
linein: input text line
delim: delimiter (tab, pipe, comma, colon)
Returns
List of split string values. The 0th entry starts with 'notok:' if an error occurs
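Quote-aware splitting with a mix of quoted and unquoted fields can be sketched with a single pass and one boolean. This is an illustration, not the library's code: SplitSketch and Split are hypothetical names, the delimiter is passed as the character itself rather than by name, and double quotes are assumed to be the quoting character and are dropped from the output.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch, not part of VerityDotNet.
public static class SplitSketch
{
    public static List<string> Split(string line, char delim)
    {
        var fields = new List<string>();
        if (line == null)
        {
            // Mirrors the documented convention of flagging errors in the 0th entry.
            fields.Add("notok:line is null");
            return fields;
        }

        var current = new StringBuilder();
        bool inQuotes = false;
        foreach (char c in line)
        {
            if (c == '"')
            {
                // A quote toggles quoted mode and is not copied to the field.
                inQuotes = !inQuotes;
            }
            else if (c == delim && !inQuotes)
            {
                // A delimiter outside quotes ends the current field.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());
        return fields;
    }
}
```

Splitting a,"b,c",d on a comma yields three fields (a, then b,c, then d), because the comma inside the quotes is kept as data. Escaped quotes inside a quoted field are not handled in this sketch.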
