Shoplogix Historical Data - Big Data Primer

This document is a primer on the big data patterns our database uses to provide the right data at the right level of detail. If you want to better understand historical data in the Shoplogix Analytics Portal, this document is for you.

Shoplogix data in Analytics Portal is always accurate to the:

  • Job

  • Shift

  • Reason

  • State

This means you can filter your historical data by any of these dimensions over any time frame.

Data Resolution

To balance performance with data resolution:

  • Data over 60 days old is stored at the daily level.

  • Data less than 60 days old is stored at the hourly level.

This allows for yearly and other long-range analysis previously unavailable in Shoplogix reporting tools, because reports process 24 times fewer records for data more than 60 days (about 2 months) old. The trade-off is that analyzing a precise one-hour window, such as 6pm-7pm, is only possible for the most recent 60 days of data.
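The arithmetic behind this saving can be sketched as follows. This is an illustration only, not how the database is implemented: the function name and constants are hypothetical, and it counts time buckets (one per hour or per day) rather than actual rows, which also vary by job, shift, reason, and state.

```python
HOURS_PER_DAY = 24
HOURLY_WINDOW_DAYS = 60  # from this document: newest 60 days are kept hourly


def time_buckets(days_of_history: int) -> int:
    """Count time buckets when the newest 60 days are hourly and the rest daily.

    Hypothetical helper for illustration; not part of any Shoplogix API.
    """
    hourly_days = min(days_of_history, HOURLY_WINDOW_DAYS)
    daily_days = days_of_history - hourly_days
    return hourly_days * HOURS_PER_DAY + daily_days


# One year of history, everything kept hourly: 365 * 24 buckets.
all_hourly = 365 * HOURS_PER_DAY

# One year of history with the 60-day rule: 60 * 24 + 305 buckets.
mixed = time_buckets(365)

print(all_hourly, mixed)
```

Under these assumptions, a one-year report touches 1,745 time buckets instead of 8,760, and the gap widens as more years of history are included.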

Historical data is available on:

  • [Fact Core]

  • [Dim Jobs]

  • [Dim Shift Instances]

  • [Dim Calendar DateTime]

  • [Fact Scrap Reasons]

and no other tables.

For example:

  1. If you want to view data from the previous 3 months, this is possible only at the daily level, not hourly.

  2. If you want to view data from the previous 3 weeks, this is possible at the hourly level, in addition to daily level.
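The two examples above can be expressed as a small lookup. This is a minimal sketch, not a Shoplogix API: the function name is hypothetical, and treating data that is exactly 60 days old as still hourly is an assumption, since this document only specifies "over" and "less than" 60 days.

```python
from datetime import date
from typing import List

HOURLY_WINDOW_DAYS = 60  # threshold stated in this document


def available_resolutions(data_date: date, today: date) -> List[str]:
    """Return the levels of detail available for data recorded on data_date.

    Hypothetical helper. Assumption: data exactly 60 days old is still hourly.
    """
    age_days = (today - data_date).days
    if age_days > HOURLY_WINDOW_DAYS:
        return ["daily"]
    return ["hourly", "daily"]


today = date(2024, 6, 1)
three_weeks_ago = date(2024, 5, 11)   # 21 days old
three_months_ago = date(2024, 3, 1)   # 92 days old

print(available_resolutions(three_weeks_ago, today))   # ['hourly', 'daily']
print(available_resolutions(three_months_ago, today))  # ['daily']
```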

Data Refresh and the 60 day data lock-in

Once data is older than 60 days, Analytics Portal locks in that data exactly as it is for historical reporting purposes.

Refreshing years of historical data is time-consuming for the servers. If data past the 60-day threshold needs to be refreshed, your CS engineer can discuss the options with you.

Data Load Rate

Historical data needs to be loaded initially for new customers, or reloaded when a customer-requested change invalidates existing history. The “backloader” loads roughly 1 day’s worth of data per hour, so you will see about 24 days (roughly one month) of additional history each day until it reaches the maximum history load date.

Maximum History Load Date

By default, each machine will try to load data back to 2018-01-01. If more data is needed, file a ticket and Product will help work through the use case.
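To get a rough sense of how long a full backload might take, the load rate and default load date can be combined as below. This is a back-of-the-envelope sketch: the function name and constants are hypothetical, and it assumes the backloader runs continuously at the approximate rate of 1 day of data per hour, so actual times will vary.

```python
import math
from datetime import date

DEFAULT_MAX_LOAD_DATE = date(2018, 1, 1)  # default stated in this document
DAYS_LOADED_PER_HOUR = 1                  # approximate rate stated in this document
HOURS_PER_DAY = 24


def estimated_backload_days(today: date,
                            load_back_to: date = DEFAULT_MAX_LOAD_DATE) -> int:
    """Estimate calendar days needed to load history back to load_back_to.

    Hypothetical helper; assumes the backloader runs around the clock.
    """
    history_days = max(0, (today - load_back_to).days)
    hours_needed = history_days / DAYS_LOADED_PER_HOUR
    return math.ceil(hours_needed / HOURS_PER_DAY)


# 2018-01-01 to 2024-01-01 is 2191 days of history,
# which takes about 92 days at roughly 24 days of history per day.
print(estimated_backload_days(date(2024, 1, 1)))
```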