Entity ID in time series scoring
This article, https://community.ptc.com/t5/IoT-Tips/Considerations-for-Handling-Time-Series-Data/ta-p/818763, defines the entity ID as follows:
"“ENTITY_ID”, [is] the identifier for an entity, such as a machine serial number. The ENTITY_ID field should remain the same as long as there are no missing timestamps and it is within the same asset but should be different for different assets or asset runs in order to accurately assign history during model training and scoring."
"If there are gaps in the time series data, it is recommended to restart the series after the gap as a new entity."
This makes perfect sense to me: to avoid mixing training data from different machines or different runs, you separate the dataset with the entity ID label. In my case, I have only one machine/system but several different runs spanning a large time window, so I would assign a different entity ID to each of these runs.
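To make the labeling concrete, here is a minimal sketch of how I would assign the entity IDs for training, assuming the rows are sorted by timestamp and that any gap larger than some threshold marks a new run. The function name `assign_entity_ids`, the `run` prefix, the 5-minute threshold, and the sample timestamps are all my own placeholders for illustration, not anything taken from the article or the product.

```python
from datetime import datetime, timedelta


def assign_entity_ids(timestamps, max_gap, prefix="run"):
    """Label each timestamp with an entity ID, starting a new ID after every gap.

    timestamps: datetimes sorted in ascending order.
    max_gap: largest timedelta still treated as part of the same run
             (an assumed threshold, to be chosen per dataset).
    """
    entity_ids = []
    run_number = 0
    previous = None
    for ts in timestamps:
        # Start a new entity at the first row and whenever the gap is too large.
        if previous is None or ts - previous > max_gap:
            run_number += 1
        entity_ids.append(f"{prefix}{run_number}")
        previous = ts
    return entity_ids


# Hypothetical data: three short runs on three different days.
timestamps = [
    datetime(2023, 1, 1, 8, 0), datetime(2023, 1, 1, 8, 1),
    datetime(2023, 2, 1, 8, 0), datetime(2023, 2, 1, 8, 1),
    datetime(2023, 3, 1, 8, 0),
]
labels = assign_entity_ids(timestamps, max_gap=timedelta(minutes=5))
print(labels)  # ['run1', 'run1', 'run2', 'run2', 'run3']
```

This only covers how the training data would be split. It does not say anything about which ID to use at scoring time.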
My doubt arises when requesting predictions. The scoring dataset needs to include an entity ID. This makes total sense when the entity ID distinguishes different assets, since it is basically another feature/label. But in my case, which entity ID should I pass for scoring?
For example, suppose I have data from 3 runs on 3 different days, with a large time gap between them. In the training dataset I need to assign an entity ID to each one, let's say run1, run2, and run3. When scoring in the future, which entity ID should I use: run1, run2, or run3? Why would I choose one over the other if they were only separated to avoid mixing runs?

