Scope of this Document
This document outlines a new enhancement to the Shoplogix Analytics architecture which mitigates issues relating to partial/incomplete data in the Analytics Portal.
Details About
...
Partial Build Failures
Background: During the stages that precede a cube build in the Analytics Portal, some core data queries produce incomplete information, which is later used to build the cube.
Issue: Sometimes, individual queries to a SaaS server fail or time out, which results in incomplete data being stored in the intermediate database (i.e. data is missing for some machines). Additionally, on rare occasions, a server may be temporarily unavailable, which can cause the entire server to be skipped. This partial data is ‘staged’ for an upcoming cube build. When the cube later builds, it uses this partial data, which then becomes visible to users in their Analytics dashboards.
Impact: This primarily impacts the live-data window (within 45 to 60 days), for random groups of machines and servers. The missing data is temporary and persists only until the next build cycle.
Enhancement: Skipping Partial Cube Builds
To protect against partial data being built into a cube, a new mechanism has been implemented that detects and skips builds that would use partial data. Partial data refers to cases where either:
One or more servers that should be included are missing from the data
Less than 100% of the machines on a given server return data when queried
When partial data is detected in the intermediate database, the ‘trigger’ to build the next cube is disabled, which effectively skips that build.
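The two partial-data conditions above can be sketched as a simple check. This is a minimal illustration only, not the actual Shoplogix implementation: the names `ServerSnapshot`, `is_partial`, and `should_trigger_build`, as well as the shape of the staged data, are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ServerSnapshot:
    """Hypothetical summary of what one server staged for the next build."""
    name: str
    machines_expected: int   # machines that should return data on this server
    machines_reported: int   # machines that actually returned data


def is_partial(expected_servers: Iterable[str],
               staged: Iterable[ServerSnapshot]) -> bool:
    """Return True if the staged data meets either partial-data condition."""
    staged = list(staged)
    staged_names = {snapshot.name for snapshot in staged}
    # Condition 1: one or more expected servers are missing from the data.
    if set(expected_servers) - staged_names:
        return True
    # Condition 2: fewer than 100% of the machines on some server returned data.
    return any(s.machines_reported < s.machines_expected for s in staged)


def should_trigger_build(expected_servers: Iterable[str],
                         staged: Iterable[ServerSnapshot]) -> bool:
    """The build 'trigger' stays enabled only when the staged data is complete."""
    return not is_partial(expected_servers, staged)
```

In this sketch, a single machine that fails to report is enough to disable the trigger, which mirrors the strict 100% rule described above.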
Result / Side Effects of Skipping Partial Builds
When a build is skipped due to detected partial data, the previous cube build remains the source for analytics dashboards. For example, let’s assume cubes build every 30 minutes, and the last successful build was at 10:00 AM. If partial data is detected in the 10:30 AM build cycle, that build is skipped, and the existing 10:00 AM cube build remains in place until it is replaced by the next build cycle at 11:00 AM.
Although unlikely, if more than one partial build occurs in a row, the ‘build skipping’ will continue, and the gap between cube builds will keep growing until a successful build is finally created. This can be seen using the ‘last build’ timestamp at the top of the dashboard.
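The effect of skipped builds on the ‘last build’ timestamp can be illustrated with a small simulation. This is a hedged sketch under stated assumptions: the function `simulate_builds` and its parameters are hypothetical, and the calendar date used in the usage note is arbitrary.

```python
from datetime import datetime, timedelta


def simulate_builds(start, interval, cycle_is_partial):
    """Return (cycle_time, last_build) pairs, one per build cycle.

    start: time of the last successful build before the simulation begins.
    interval: time between build cycles.
    cycle_is_partial: one bool per subsequent cycle (True means skipped).
    """
    last_build = start
    timeline = []
    for i, partial in enumerate(cycle_is_partial, start=1):
        cycle_time = start + i * interval
        if not partial:
            # A complete cycle replaces the cube that dashboards read from.
            last_build = cycle_time
        # A skipped cycle leaves the previous cube (and its timestamp) in place.
        timeline.append((cycle_time, last_build))
    return timeline
```

Calling `simulate_builds(datetime(2024, 1, 1, 10, 0), timedelta(minutes=30), [True, False])` reproduces the example above: after the skipped 10:30 AM cycle the last build is still 10:00 AM, and after the 11:00 AM cycle it becomes 11:00 AM. Passing several consecutive `True` values shows the gap between builds growing until a complete cycle occurs.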
Appendix: Analytics Architecture Overview
...