The metadata format has changed in the 8.1 release as part of the architecture update introduced in 8.1. This post describes the changes to the metadata and points you to more information on both the new metadata format and the other changes in the 8.1 release.

New Structure

The image below provides an excerpt of what the metadata file looks like in practice.

[Image: excerpt of a metadata file]

The table below defines each field in the metadata and explains how to populate it in your metadata file.

| Parameter | Description | Required/Optional |
| --- | --- | --- |
| fieldName | The exact name of the field as it appears in the data file. | Required |
| values | A list of the acceptable values for the field. Note: For Ordinal opTypes, the values must be presented in the correct order. | Required if the opType is Ordinal. Optional for the Categorical opType. Do not use for Boolean and Continuous. |
| range | For a Continuous field, defines the minimum and maximum values the field can accept. For informational purposes only. | Optional |
| dataType | Describes what type of data the field contains. Options include: Long, Integer, Short, Byte, Double, Boolean, String, Other. Note: Select the most accurate dataType. Selecting the String dataType for numeric data can lead to undesirable results. | Required |
| opType | Describes how the data in the field can be used. Options include: Categorical, Boolean, Ordinal, Continuous, Informational, Temporal, Entity_ID. | Required |
| timeSamplingInterval | An integer representing the time between observations in a temporal field. | Required if the opType is Temporal. Do not use for other opTypes. |
| isStatic | A flag indicating whether the value in a temporal field can change over time. Marking a field as static reduces training time by removing redundant data points for fields that do not change. | Optional |

Things to Remember

- The metadata file you create must match your data file: every column in your dataset must be represented in the metadata file.
- The metadata file must be a JSON file.
- Setting the opType parameter incorrectly can have a severe impact on system performance. For example, if a numerical field with thousands of distinct values is marked as Categorical instead of Continuous, the system treats each value as an independent category rather than as a number, which results in significantly longer processing time.

Additional References

For more information on all the other changes that are new in the 8.1 release, please follow this link for the complete reference document. Feel free to use the blank example metadata file attached to this post to help you get started on your own.
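Before submitting a metadata file, it can help to check it against the rules described in this post. The Python sketch below is one way to do that, not an official tool. It assumes the metadata is a JSON array with one object per field, using the parameter names from this post; the actual top-level layout of an 8.1 metadata file may differ. The helper names `check_field` and `check_metadata`, and the sample field names, are made up for illustration.

```python
import json

# Allowed options, copied from the dataType and opType descriptions in this post.
DATA_TYPES = {"Long", "Integer", "Short", "Byte", "Double", "Boolean", "String", "Other"}
OP_TYPES = {"Categorical", "Boolean", "Ordinal", "Continuous",
            "Informational", "Temporal", "Entity_ID"}


def check_field(field):
    """Return a list of rule violations for one field entry (hypothetical helper)."""
    problems = []
    if not field.get("fieldName"):
        problems.append("fieldName is required")
    if field.get("dataType") not in DATA_TYPES:
        problems.append("dataType is missing or not a listed option")
    op_type = field.get("opType")
    if op_type not in OP_TYPES:
        problems.append("opType is missing or not a listed option")
    # values: required for Ordinal, not allowed for Boolean and Continuous.
    if op_type == "Ordinal" and not field.get("values"):
        problems.append("values is required for Ordinal fields")
    if op_type in ("Boolean", "Continuous") and "values" in field:
        problems.append("values must not be used for %s fields" % op_type)
    # timeSamplingInterval: a required integer for Temporal, not allowed elsewhere.
    interval = field.get("timeSamplingInterval")
    if op_type == "Temporal":
        if not isinstance(interval, int) or isinstance(interval, bool):
            problems.append("timeSamplingInterval must be an integer for Temporal fields")
    elif "timeSamplingInterval" in field:
        problems.append("timeSamplingInterval must not be used for non-Temporal fields")
    return problems


def check_metadata(fields, data_columns):
    """Check every field entry, then confirm each data column is described."""
    report = []
    for field in fields:
        name = field.get("fieldName", "<unnamed>")
        report.extend("%s: %s" % (name, p) for p in check_field(field))
    described = {field.get("fieldName") for field in fields}
    for column in data_columns:
        if column not in described:
            report.append("%s: column has no entry in the metadata" % column)
    return report


# Assumed layout: a JSON array with one object per field (illustrative data only).
sample = json.loads("""
[
  {"fieldName": "age", "dataType": "Integer", "opType": "Continuous"},
  {"fieldName": "size", "dataType": "String", "opType": "Ordinal",
   "values": ["S", "M", "L"]},
  {"fieldName": "visit_ts", "dataType": "Long", "opType": "Temporal"}
]
""")

for line in check_metadata(sample, ["age", "size", "visit_ts", "region"]):
    print(line)
```

In this sample, `visit_ts` is flagged because a Temporal field needs a `timeSamplingInterval`, and `region` is flagged because it appears in the data columns but has no metadata entry. The sketch does not catch a numeric field mislabeled as Categorical; that check would need the data itself.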