
Custom Transforms Using Python/JavaScript/JSON

This article provides information about creating custom attribute and record transformations in Nexla using Python, JavaScript, and/or JSON code.

1. What Are Custom Code Transformations?

Nexla includes many pre-built transformation functions within the Nexset Designer that can be used to accomplish most data flows; however, some data flows require specific, customized data transformations. For these operations, users can create custom attribute and record transforms in Nexla using Python, JavaScript, and/or JSON code. Custom transforms can also be used for transformations that would otherwise require multiple pulldown-menu attribute selections or when operating on nested arrays of objects.

  • Custom attribute transforms are customized transformation functions that modify attributes in an input Nexset. Custom attribute transforms follow the function signature of transformAttribute(input, metadata, args) and receive the entire incoming record as input.

  • For data flows that require the application of a consistent set of transform rules regardless of the input Nexset schema, custom record transforms can be used to specify the entire output Nexset record. Custom record transforms follow the function signature of transform(input, metadata, args), receiving the entire incoming Nexset record as input and any associated metadata as metadata.

  • Both custom attribute and record transforms can be constructed within individual Nexset transformations, or they can be created as reusable transforms that can be applied to multiple Nexsets and shared with other users in the organization.

    For more information about creating, using, and sharing reusable transforms, see the Reusable Attribute Transforms and Reusable Record Transforms articles in the Help Center.
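    As a rough illustration of the difference between the two kinds of transform, the Python sketch below uses the two signatures named above. The attribute names (price_per_unit, quantity, total_price) are hypothetical, and the idea that an attribute transform returns a single value while a record transform returns the whole output record is an assumption made for this sketch, not a statement of Nexla's behavior.

```python
# Illustrative sketch only; attribute names are hypothetical.

def transformAttribute(input, metadata, args):
    # Assumption: an attribute transform returns the value of ONE output attribute.
    return float(input.get("price_per_unit", 0)) * int(input.get("quantity", 0))


def transform(input, metadata, args):
    # A record transform builds and returns the ENTIRE output record.
    return {
        "prod_id": input.get("prod_id"),
        "total_price": float(input.get("price_per_unit", 0)) * int(input.get("quantity", 0)),
    }


record = {"prod_id": "472465", "price_per_unit": "35", "quantity": "126"}
attribute_value = transformAttribute(record, {}, None)
output_record = transform(record, {}, None)
```

    Both functions receive the full incoming record; only the shape of the returned value differs in this sketch.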

2. Create a Custom Transform in the Nexset Designer

To learn how to access the Nexset Designer, see the Nexset Designer Overview article in the Help Center.

  1. Click the Add Rule Group button in the Nexset Rules panel, and select Transform: Code from the menu that appears.


  2. In the newly created Transform: Code rule group, select the desired coding language—Python, JavaScript, or JSON—from the pulldown menu. These options are found under the "Write Custom Code" category at the top of the list.

    JavaScript code written in Nexla must be in vanilla JavaScript format (no ES6).


  3. Enter the necessary code in the text field below the pulldown menu, adhering to the function signature.

3. Function Signatures

When writing custom transform code, users must follow the function signature corresponding to the selected language. These function signatures are displayed in the rule group text field when each language is selected.


  • The function signatures for each language are shown below. In these signatures, input contains the input Nexset record as a JSON object, and metadata contains Nexla metadata attributes about the input Nexset.

    Python

    def transform(input, metadata, args):
        return input

    JavaScript

    function transform(input, metadata, args) {
        return input;
    }

    JSON

    []
  • Any custom code and unlimited supporting functions can be entered in the text field, but the function signature cannot be changed or removed.
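
    A minimal sketch of a supporting function used alongside the required signature is shown below. The helper name to_number and the attribute names are hypothetical and chosen only for illustration.

```python
def to_number(value, default=0.0):
    """Hypothetical helper: parse a numeric string, falling back to a default."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def transform(input, metadata, args):
    # The required signature is left unchanged; helpers are defined beside it.
    output = dict(input)  # copy so the incoming record is not mutated
    output["price_per_unit"] = to_number(input.get("price_per_unit"))
    output["quantity"] = to_number(input.get("quantity"))
    return output


cleaned = transform({"price_per_unit": "35", "quantity": "n/a"}, {}, None)
```

    Keeping parsing logic in a helper keeps the transform body short and makes the fallback behavior easy to change in one place.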

4. Metadata Parameter

The metadata parameter contains Nexla metadata attributes about the input Nexset. These metadata attributes comprise information about how the Nexset was brought into Nexla and can be used as needed when specifying transform rules for the output Nexset.

The table below lists the available Nexla metadata attributes.

Nexla Metadata Attributes

| Attribute Name | Description | Example |
| --- | --- | --- |
| ingestTime | Nexset record ingestion time in Unix epoch milliseconds | 1538562212976 |
| resourceId | ID of the resource at the root of the flow | 12302 |
| resourceType | Resource type of the resource ID metadata | SOURCE |
| runID | Unique flow execution run ID of the source | 1638205638431 |
| sourceBucket | Top-level bucket/directory of the data source | daily-logs.example.com |
| sourceOffset | Data source offset of the record, i.e., the line number of the file to which the record belongs | 1001 |
| sourcePath | Path of the ingested resource from the top-level bucket/directory/API path/database/etc. | hourly_events/2017-07-28/1700.json |
| sourceType | Data source type of the input Nexset | FTP |
| tags | Optional captured key-value properties of the input Nexset source | {"from_email": "test@nexla.com"} |
| trackerId | Unique record tracker containing lineage information about the record as it flows through Nexla | cjExMDA5OjlyNjY2OnRyYWRlc18wOV8yNy5jc3Y6MjoxOjE1NjcwNtgyNdy5MDA7Mjl2NzE6MzU5OTc7ODgzNzoxOjl5Mw |
| transformTime | Nexla record transformation time in Unix epoch milliseconds | 1498776641620 |
| transformTimeISO8601 | Nexla record transformation time in ISO8601 format | 2017-06-29T22:50:41Z |
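
The sketch below shows one way metadata attributes might be copied into an output record. It assumes that metadata behaves like a Python dictionary keyed by the attribute names in the table; the output attribute names ingested_at_ms and source_file are hypothetical.

```python
def transform(input, metadata, args):
    output = dict(input)
    # Assumption: metadata is dict-like and keyed by the names in the table above.
    output["ingested_at_ms"] = metadata.get("ingestTime")
    output["source_file"] = metadata.get("sourcePath")
    return output


# Sample metadata values taken from the table, for a local dry run.
sample_metadata = {
    "ingestTime": 1538562212976,
    "sourcePath": "hourly_events/2017-07-28/1700.json",
}
enriched = transform({"prod_id": "472465"}, sample_metadata, None)
```

Using .get rather than direct indexing means a missing metadata attribute yields None instead of raising an error.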

5. Using Nexla's Pre-Built Functions in Custom Transforms

Nexset rules available in the Nexset Designer can also easily be applied in custom-coded transforms created with Python or JavaScript. The table below lists the function signatures of Nexla's predefined transform functions that may be useful in custom transforms.

Only pre-built functions that are typically useful in custom transforms are shown in the table. For a complete list of Nexla's pre-built transform rules, see the List of Nexla's Pre-Built Transforms article in the Help Center.

To use any of the listed functions, replace <transform> with the function name, and add the relevant parameters as arguments (replacing param1...paramN) in the following call format:

nexla_fn.call('<transform>',param1...paramN)

Security

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| stringHash | Tokenize the input value using the MD5 or SHA256 algorithm | 1. hash_method: Hash encryption method – either 'md5' or 'sha256' for the corresponding tokenization algorithm 2. stringToHash: Input string or attribute of which to return the encrypted value | nexla_fn.call("stringHash","sha256","test") |
| integerHash | Encrypt an integer value | 1. hash_method: Hash encryption method – either 'md5' or 'sha256' for the corresponding tokenization algorithm 2. integerToHash: Input integer or attribute of which to return the encrypted value | nexla_fn.call("integerHash","sha256",123) |

Lookups

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| toMapValue2 | Identifies the lookup row of an input attribute based on matching with the primary key values of the specified lookup and returns the value of the lookup secondary key column | 1. lookupId: Nexla static or dynamic lookup from which to fetch entries 2. lookupOnKey: Input attribute or value for filtering lookup entries 3. lookupKeyToFetch: Secondary key column from the lookup from which the value of the matching row will be returned | nexla_fn.call("toMapValue2",1200,"1","code") |
| getMap | Returns the entire object contained in a lookup table based on exact matching of an input with the primary key of the specified lookup | 1. lookupId: Nexla static or dynamic lookup from which to fetch entries 2. lookupOnKey: Input attribute or value for filtering lookup entries | nexla_fn.call("getMap",1200,"1") |

Date & Time

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| epochFromString | Converts a valid ISO8601-formatted string timestamp to the corresponding Unix epoch time | 1. inputString: ISO8601-formatted input string to be converted to Unix epoch time | nexla_fn.call("epochFromString","2020-11-10T07:44:25Z") |
| extractDate | Extracts specific date parts from a date-time string | 1. toFormat: Desired format of output that includes the extracted date-time parts of input string 2. inputString: String value that needs to be parsed (must be of a valid date-time format) | nexla_fn.call("extractDate","dd-MMM-yy","09/28/2017 11:43:00") |
| convertTimeZone | Converts a valid date-time string from one time zone to another time zone | 1. fromZone: Time zone from which the date-time string will be converted 2. toZone: Time zone to which the date-time string will be converted 3. timeToConvert: Date-time string to be converted to a different time zone | nexla_fn.call("convertTimeZone","GMT","America/New_York","2020/11/09") |
| timestampDifference | Returns the absolute difference of two timestamps in one of the supported ISO8601 units of time | 1. differenceUnit: Time unit for computing the difference (sss – milliseconds, s – seconds, m – minutes, h – hours, d – days, w – weeks) 2. subtractTs: First timestamp to be used for computing the difference between timestamps 3. fromTs: Second timestamp to be used for computing the difference between timestamps | nexla_fn.call("timestampDifference","m","2016-03-22T22:54:14Z","2016-03-22T22:44:14Z") |
| iso8601FromEpoch | Converts a valid Unix epoch time to the corresponding ISO8601-formatted string timestamp | 1. unixEpoch: Time in Unix epoch format to be converted to ISO8601 format | nexla_fn.call("iso8601FromEpoch",1604994265) |
| iso8601FromString | Converts a valid date-time string to the corresponding ISO8601-formatted string timestamp | 1. timeString: Time string to be converted to ISO8601 format (must be in a valid date-time format) | nexla_fn.call("iso8601FromString","25/02/2017") |
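
For readers who want to sanity-check expected values outside Nexla, the sketch below approximates three of these conversions with Python's standard datetime module. It is a local approximation, not Nexla's implementation: the helper names are hypothetical, timestamps are assumed to be UTC in the YYYY-MM-DDTHH:MM:SSZ form, and the epoch is assumed to be in seconds, as in the iso8601FromEpoch example call.

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed input/output layout (UTC)


def epoch_from_string(iso_string):
    """Local approximation of epochFromString (epoch in seconds, assumed)."""
    parsed = datetime.strptime(iso_string, ISO_FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def iso8601_from_epoch(epoch_seconds):
    """Local approximation of iso8601FromEpoch."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime(ISO_FORMAT)


def minutes_between(first_ts, second_ts):
    """Absolute difference of two timestamps in minutes."""
    return abs(epoch_from_string(first_ts) - epoch_from_string(second_ts)) / 60


epoch = epoch_from_string("2020-11-10T07:44:25Z")
iso = iso8601_from_epoch(1604994265)
diff = minutes_between("2016-03-22T22:54:14Z", "2016-03-22T22:44:14Z")
```

The two example calls in the table are inverses of each other under these assumptions: the string 2020-11-10T07:44:25Z corresponds to 1604994265 seconds since the Unix epoch.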

Mathematical Operations

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| min | Returns the minimum value of elements in an array | 1. inputArray: Array attribute from which to extract the minimum value | nexla_fn.call("min",[2,4,6]) |
| max | Returns the maximum value of elements in an array | 1. inputArray: Array attribute from which to extract the maximum value | nexla_fn.call("max",[2,4,6]) |
| avg | Returns the average value of elements in an array | 1. inputArray: Array attribute of which to extract the average value | nexla_fn.call("avg",[2,4,6]) |
| bucket | Segments input attribute values into equally sized bins | 1. numberOfBuckets: Size of the bucket or segment (the default size is 10) 2. input: Number or array of numbers to be segmented into bins | nexla_fn.call("bucket",5,[11,21,99]) |

Array Operations

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| join | Joins all elements of an array into a string | 1. delimiter: Character or string separator for joining array elements 2. arrayOfStrings: Input array to be converted to a string | nexla_fn.call("join"," and ",["First","Second"]) |
| toStringWithLeadingZeroes | Converts an integer to a string of at least a minimum length | 1. lengthOfOutput: Minimum desired length of the output string 2. integerToConvert: Input number to be converted | nexla_fn.call("toStringWithLeadingZeroes",5,123) |
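
As a point of comparison, the plain-Python sketch below produces what the two descriptions above suggest for the example calls. It is an approximation based only on those descriptions; the exact output of the Nexla functions is not confirmed here, and the helper names are hypothetical.

```python
def join_elements(delimiter, array_of_strings):
    # Joins all elements of an array into one string, as the "join" row describes.
    return delimiter.join(array_of_strings)


def to_string_with_leading_zeroes(length_of_output, integer_to_convert):
    # Pads with zeroes on the left up to a minimum length; longer numbers are unchanged.
    return str(integer_to_convert).zfill(length_of_output)


joined = join_elements(" and ", ["First", "Second"])
padded = to_string_with_leading_zeroes(5, 123)
unpadded = to_string_with_leading_zeroes(2, 123)
```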

Location

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| GeoDetection | Extracts location information from a valid IP address string | 1. infoToExtract: Location information to extract ('city', 'continent', 'country', 'country_code', 'dma', 'lattitude', 'longitude', 'postal', 'region', 'region_code', or 'timezone') 2. fromIPAddress: Valid IP Address string to be parsed | nexla_fn.call("GeoDetection","country_code","73.223.52.74") |

User Agent

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| UserAgentDetection | Extracts device information from a valid User Agent string | 1. infoToExtract: Device information to extract ('browser', 'browser_version', 'device_make', 'device_type', or 'operating_system') 2. fromUserAgent: Valid User Agent string to be parsed | nexla_fn.call("UserAgentDetection","device_make","Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1") |

Automotive

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| VINDetection | Extracts vehicle information from a valid VIN string | 1. infoToExtract: Vehicle information to extract ('country', 'manufacturer', or 'year') 2. fromVIN: Valid VIN string to be parsed | nexla_fn.call("VINDetection","year","1GNDM19X35B110457") |

Artificial Intelligence

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| oneHotEncoding | Performs one-hot encoding of an input attribute to produce an object | 1. lookupId: Nexla lookup based on which to perform encoding 2. attrToEncode: Attribute that, if present as a key in the lookup, will be encoded to 1 | nexla_fn.call("oneHotEncoding",1234,<attr>) |

String Operations

| Function | Description | Parameters | Example Call |
| --- | --- | --- | --- |
| grok | Parses input unstructured log data into an object with structured fields using a selected or user-defined valid grok pattern | 1. grokPattern: Grok pattern to extract 2. inputString: Input string to be parsed (usually text streamed from a standard log file, e.g., server logs) | nexla_fn.call("grok","%{SYSLOGBASE}","Jan 1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>") |
| regexFind | Returns an array of all matches of the regular expression in the input string | 1. regexPatternBase64: Base64-encoded regular expression pattern against which to test the input string 2. inputString: Input string or text to test against the regular expression pattern | nexla_fn.call("regexFind","W2EtekEtWl1bMS05XQ==","abcd1234e5") |
| regexMatch | Returns boolean true or false depending on whether or not the input string matches the provided regular expression | 1. regexPatternBase64: Base64-encoded regular expression pattern against which to test the input string 2. inputString: Input string or text to test against the regular expression pattern | nexla_fn.call("regexMatch","XihbYS16QS1aMC05X1wtXC5dKylAKFthLXpBLVowLTlfXC1cLl0rKVwuKFthLXpBLVpdezIsNX0pJA==","test@acme.com") |
| toJsonString | Converts a JSON object/array to a JSON string | 1. objectToStringify: Input object/attribute to be flattened into a JSON string | nexla_fn.call("toJsonString",{"one":1,"two":2}) |
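
Because regexFind and regexMatch take a Base64-encoded pattern, the pattern must be encoded before it is passed in. The sketch below shows one way to produce that string with Python's standard library; encode_pattern is a hypothetical helper name. The pattern [a-zA-Z][1-9] is what the regexFind example string decodes to. The match list is computed with Python's own re module and is only an indication, since Nexla's regular expression engine may behave differently.

```python
import base64
import re


def encode_pattern(pattern):
    """Hypothetical helper: Base64-encode a regular expression pattern."""
    return base64.b64encode(pattern.encode("utf-8")).decode("ascii")


pattern = "[a-zA-Z][1-9]"
encoded = encode_pattern(pattern)
decoded = base64.b64decode(encoded).decode("utf-8")

# Python-side check of what the pattern matches in the example input string.
matches = re.findall(pattern, "abcd1234e5")
```

The encoded value can then be passed as the regexPatternBase64 parameter in place of a hand-encoded string.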

6. Examples Using Pre-Built Nexla Functions

This section provides some examples of using Nexla's predefined functions in custom transforms. The custom code for each transform example is shown in both Python and JavaScript.

6.1 Transform an Attribute Using the Nexla Hashing Function

This is a simple example of defining a custom transform to hash an input attribute using Nexla's predefined hashing function.

  • Below is an input record from a Nexset:

    {
      "prod_id": "472465",
      "prod_name": "Google Chromecast",
      "price_per_unit": "35",
      "quantity": "126"
    }
  • The custom transform in this example should generate an output record based on the following rules:

    • Rule 1: Create a new attribute, "hash_prod_name", that is an MD5-hashed representation of the value of the input attribute "prod_name"

    • Rule 2: Pass through the "prod_id" and "prod_name" attributes without changes

    • Rule 3: Omit the "price_per_unit" and "quantity" attributes

    A transform following these rules should produce the following output from the input shown for this example:

    {
      "hash_prod_name": "4fdf0a2701a1105890efbb75e2c7d0b7",
      "prod_id": "472465",
      "prod_name": "Google Chromecast"
    }
  • Typically, the Nexset Designer is recommended for creating this type of transform, as transforms constructed in this way are easy to maintain and understand. However, the following code snippets can also be used to obtain the same result:

    Python

    def transform(input, metadata, args):
        output = {}
        # Pass through prod_id and prod_name unchanged
        output["prod_id"] = input.get("prod_id")
        output["prod_name"] = input.get("prod_name")
        # Hash the product name (not the ID) with MD5
        output["hash_prod_name"] = nexla_fn.call('stringHash','md5',input.get("prod_name"))
        return output

    JavaScript

    function transform(input, metadata, args) {
        var output = {};
        // Pass through prod_id and prod_name unchanged
        output.prod_id = input.prod_id;
        output.prod_name = input.prod_name;
        // Hash the product name with MD5
        output.hash_prod_name = nexla_fn.call('stringHash','md5',input.prod_name);
        return output;
    }
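
    To try the logic of this example outside Nexla, one option is a small local stand-in for nexla_fn. The sketch below is hypothetical: LocalNexlaFn is not part of Nexla, it stubs only stringHash, and it assumes the hash is returned as a hexadecimal digest of the UTF-8 encoded string. It is not confirmed to reproduce the values Nexla returns.

```python
import hashlib


class LocalNexlaFn:
    """Hypothetical local stand-in for nexla_fn; only 'stringHash' is stubbed."""

    def call(self, name, *params):
        if name == "stringHash":
            hash_method, string_to_hash = params
            # Assumption: hex digest of the UTF-8 encoded input string.
            return hashlib.new(hash_method, str(string_to_hash).encode("utf-8")).hexdigest()
        raise NotImplementedError("No local stub for " + name)


nexla_fn = LocalNexlaFn()


def transform(input, metadata, args):
    output = {}
    output["prod_id"] = input.get("prod_id")
    output["prod_name"] = input.get("prod_name")
    output["hash_prod_name"] = nexla_fn.call("stringHash", "md5", input.get("prod_name"))
    return output


record = {
    "prod_id": "472465",
    "prod_name": "Google Chromecast",
    "price_per_unit": "35",
    "quantity": "126",
}
result = transform(record, {}, None)
```

    The stub is only for local dry runs; inside Nexla, nexla_fn is provided by the platform and should not be redefined.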

6.2 Derive an Attribute from a Lookup Table

One of Nexla's most powerful transform capabilities is looking up a value from a previously created lookup table. This example demonstrates how to call a Nexla lookup from a custom transform.

For more information about creating and transforming with static and dynamic lookups in Nexla, see the Help Center articles Create a Static Lookup, Create a Dynamic Lookup, and Transforming with Data Lookups.

  • Consider a dynamic lookup (id: 1630) that contains mapping between IDs ("id") and names ("name"), with "id" as the primary key.

    Sample Lookup (ID: 1630)

    | id | name |
    | --- | --- |
    | 135234 | Apple Earpods |
    | 472465 | Google Chromecast |
  • Below is an input record from a Nexset to which this lookup could be applied:

    {
      "prod_id": "472465",
      "manufacturer": "Google",
      "price_per_unit": "35",
      "quantity": "126"
    }
  • The custom transform in this example should generate an output record based on the following rules:

    • Rule 1: Pass the "prod_id" and "manufacturer" attributes without changes.

    • Rule 2: Create the new attribute "prod_name", which should be equal to the value of the "name" attribute in the lookup table row in which the "id" value is equal to the value of the "prod_id" attribute in the input record.

    • Rule 3: Omit the "price_per_unit" and "quantity" attributes.

    A transform following these rules should produce the following output from the input shown for this example:

    {
      "prod_id": "472465",
      "manufacturer": "Google",
      "prod_name": "Google Chromecast"
    }
  • The following code snippets can be used to obtain the result shown above:

    Python

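    def transform(input, metadata, args):
        output = {}
        # Pass through prod_id and manufacturer unchanged
        output["prod_id"] = input.get("prod_id")
        output["manufacturer"] = input.get("manufacturer")
        # Fetch the "name" column of lookup 1630 for the row whose key matches prod_id
        output["prod_name"] = nexla_fn.call('toMapValue2',1630,input.get("prod_id"),'name')
        return output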

    JavaScript

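    function transform(input, metadata, args) {
        var output = {};
        // Pass through prod_id and manufacturer unchanged
        output.prod_id = input.prod_id;
        output.manufacturer = input.manufacturer;
        // Fetch the "name" column of lookup 1630 for the row whose key matches prod_id
        output.prod_name = nexla_fn.call('toMapValue2',1630,input.prod_id,'name');
        return output;
    }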
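
    The lookup logic in this example can be dry-run locally with an in-memory table. The sketch below is hypothetical: LocalNexlaFn and LOOKUPS are stand-ins that are not part of Nexla, only toMapValue2 and getMap are stubbed, and returning None for an unmatched key is an assumption rather than documented Nexla behavior.

```python
# In-memory copy of the sample lookup (ID 1630), keyed by the primary key "id".
LOOKUPS = {
    1630: {
        "135234": {"id": "135234", "name": "Apple Earpods"},
        "472465": {"id": "472465", "name": "Google Chromecast"},
    }
}


class LocalNexlaFn:
    """Hypothetical local stand-in for nexla_fn; stubs two lookup functions."""

    def call(self, name, *params):
        if name == "getMap":
            lookup_id, key = params
            return LOOKUPS.get(lookup_id, {}).get(key)
        if name == "toMapValue2":
            lookup_id, key, column = params
            row = LOOKUPS.get(lookup_id, {}).get(key)
            # Assumption: an unmatched key yields None.
            return row.get(column) if row else None
        raise NotImplementedError("No local stub for " + name)


nexla_fn = LocalNexlaFn()


def transform(input, metadata, args):
    output = {}
    output["prod_id"] = input.get("prod_id")
    output["manufacturer"] = input.get("manufacturer")
    output["prod_name"] = nexla_fn.call("toMapValue2", 1630, input.get("prod_id"), "name")
    return output


matched = transform({"prod_id": "472465", "manufacturer": "Google"}, {}, None)
unmatched = transform({"prod_id": "000000", "manufacturer": "Unknown"}, {}, None)
```

    Running both a matching and a non-matching record makes it easier to decide how the transform should treat records whose key is absent from the lookup.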