Skipping Partial Cube Builds
Scope of this Document
This document describes an enhancement to the Shoplogix Analytics architecture that mitigates issues caused by partial (incomplete) data in the Analytics Portal.
Details About Partial Build Failures
Issue: During the stages that precede a cube build in the Analytics Portal, some core data queries return incomplete information, which is then used to build the cube.
Impact: This primarily affects the live-data window (within 45 to 60 days), for random groups of machines and servers. The data is missing only temporarily, until the next build cycle.
Enhancement: Skipping Partial Cube Builds
To protect against partial data being built into a cube, a new mechanism has been implemented that detects partial data and skips the affected build. Partial data refers to cases where either:
One or more servers that should be included are missing from the data, or
Fewer than 100% of the machines on a given server return data when queried.
When partial data is detected in the intermediate database, the ‘trigger’ that starts the next cube build is disabled, which effectively skips that build.
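The two partial-data conditions above can be sketched as a simple completeness check. This is only an illustration under stated assumptions, not the actual Shoplogix implementation: the `ServerSnapshot` type, the function names, and the idea of summarising the intermediate database as per-server machine counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSnapshot:
    """Hypothetical summary of one server's rows in the intermediate database."""
    expected_machines: int   # machines configured on the server
    reporting_machines: int  # machines that returned data when queried


def find_partial_data(expected_servers, snapshots):
    """Return the reasons the data is partial; an empty list means it is complete."""
    reasons = []
    for server in sorted(expected_servers):
        snap = snapshots.get(server)
        if snap is None:
            # Condition 1: a server that should be included is missing.
            reasons.append(f"server {server} is missing from the data")
        elif snap.reporting_machines < snap.expected_machines:
            # Condition 2: fewer than 100% of the server's machines returned data.
            reasons.append(
                f"server {server}: {snap.reporting_machines} of "
                f"{snap.expected_machines} machines returned data"
            )
    return reasons


def build_trigger_enabled(expected_servers, snapshots):
    """The build trigger stays enabled only when no partial data is found."""
    return not find_partial_data(expected_servers, snapshots)
```

In this sketch, a single missing server or a single non-reporting machine is enough to disable the trigger, which mirrors the all-or-nothing rule described above.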
Result / Side Effects of Skipping Partial Builds
When a build is skipped because partial data was detected, the previous cube build remains the source for the analytics dashboards. For example, assume cubes build every 30 minutes and the last successful build was at 10:00 AM. If partial data is detected in the 10:30 AM build cycle, that build is skipped, and the 10:00 AM cube remains in place until the 11:00 AM build cycle replaces it.
Although unlikely, if partial data is detected in several consecutive build cycles, each of those builds is skipped, and the gap between cube builds grows until a build succeeds. The age of the current cube can be checked using the ‘last build’ timestamp at the top of the dashboard.
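The effect of skipped builds on the ‘last build’ timestamp can be illustrated with a small simulation. This is a sketch only: the function name `last_build_timeline` and its inputs are hypothetical, the calendar date is arbitrary, and the 30-minute interval is taken from the example above rather than from any actual configuration.

```python
from datetime import datetime, timedelta


def last_build_timeline(last_success, interval, cycle_is_complete):
    """Return (cycle_time, last_build_time) pairs for consecutive build cycles.

    cycle_is_complete holds one boolean per upcoming cycle: True when the
    data is complete (the cube is rebuilt), False when partial data is
    detected (the build is skipped and the previous cube stays in place).
    """
    timeline = []
    cycle_time = last_success
    last_build = last_success
    for complete in cycle_is_complete:
        cycle_time = cycle_time + interval
        if complete:
            last_build = cycle_time
        timeline.append((cycle_time, last_build))
    return timeline


if __name__ == "__main__":
    # Last successful build at 10:00 AM, then two skipped cycles and a success.
    start = datetime(2024, 1, 1, 10, 0)  # arbitrary date, for illustration only
    for cycle, last in last_build_timeline(start, timedelta(minutes=30),
                                           [False, False, True]):
        print(f"cycle {cycle:%H:%M} -> dashboard shows last build {last:%H:%M}")
```

With two consecutive skipped cycles, the dashboard keeps showing the 10:00 build through the 10:30 and 11:00 cycles and moves to 11:30 only once a complete build succeeds.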
Appendix: Analytics Architecture Overview