LIVE Projection Dataset Reporting
Introduction
Users of the Aeros LIVE system may want to archive the results of the reporting system, warehouse them for distribution to other systems, or override the outputs. Aeros CONNECT offers users the ability to run projection reports and output them to a data warehouse.
Prerequisites
To warehouse Aeros LIVE projection data, we will need:
An Aeros LIVE installation, version 1.47.11.0 or greater
A designated data warehouse database with certain tables and procedures (described below)
CONNECT permissions for creating and scheduling automated processes
An installed CONNECT Scheduler Service, if we wish to schedule execution of warehouse batches.
Configuration
Warehousing projections is an update to the Data Warehouse 1.0 database, so we'll need a version of it that includes the "Process", "Process_Batch" and "Process_Message" tables. The Process table will need a new record with Process_Key 5, and, like other import types before it, we will use the Process_Start column to identify the first week-ending date that we will calculate batches for.
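As a minimal sketch, registering the process might look like the following. The Process_Key value and the Process_Start column come from the description above; the Process_Name column and the date value are assumptions for illustration, so confirm them against the actual Process table schema.

-- Register the projection warehouse process (Process_Key 5).
-- Process_Name is a hypothetical column; adjust to the actual Process table schema.
INSERT INTO dbo.Process (Process_Key, Process_Name, Process_Start)
VALUES (5, 'Live Projection Weekly', '2023-01-07'); -- example first week-ending date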
We'll need the Fact_Live_Projection_Weekly table. This table will require three columns (a sketch of the table follows the list):
Record_Key
General_GroupID
General_ReportStartDate
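A minimal sketch of that base table, assuming an identity key and typical SQL Server data types; the actual types should be taken from the installation script:

-- Base fact table; calculation columns are added later (see below).
CREATE TABLE dbo.Fact_Live_Projection_Weekly (
    Record_Key              INT IDENTITY(1,1) PRIMARY KEY, -- assumed identity key
    General_GroupID         INT NOT NULL,                  -- flock/group identifier
    General_ReportStartDate DATE NOT NULL                  -- start date of the projection run
);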
We'll also need to verify that the stored procedure "spDeleteFactLiveProjectionWeekly" has been installed in our data warehouse; CONNECT calls it to clean up records from periods that we are recalculating. A quick existence check is shown below.
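One quick way to confirm the procedure is installed, using standard SQL Server metadata:

-- Returns the object id if the procedure is installed, NULL otherwise.
SELECT OBJECT_ID('dbo.spDeleteFactLiveProjectionWeekly', 'P') AS ProcedureId;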
If we have not set up the data warehouse in CONNECT prior to this, we will need a ConfigConnect record with integrate system type 10 and product type 15, and a connection string pointing to our data warehouse database.
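A sketch of that record follows. The type values come from the text above, but the ConfigConnect column names used here (IntegrateSystemType, ProductType, ConnectionString) are hypothetical placeholders, so confirm them against the actual table:

-- Hypothetical column names; verify against the real ConfigConnect table.
INSERT INTO dbo.ConfigConnect (IntegrateSystemType, ProductType, ConnectionString)
VALUES (10, 15, 'Server=myserver;Database=MyDataWarehouse;Trusted_Connection=True;');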
If we have not added connection strings to Aeros LIVE, we'll need to add those as well.
Once we are configured, we'll need to determine which LIVE calculations we wish to warehouse. Like the House Report Warehouse before this, we'll do that by adding columns named after projection report calculations to the fact table.
A full list of calculations is provided in a script with the installation. We'll choose the ones we want to warehouse and execute those lines of code. We'll have to keep in mind that there are more calculations than there are columns supported by a SQL table, and that the more we add, the longer it will take to calculate batches, so the columns we choose not to include are just as important to consider as the ones that we do!
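For illustration, the lines we execute from that script presumably look something like the following; the calculation column names here are made-up examples, and the real names come from the installation script:

-- Example calculation columns; use the names from the installation script.
ALTER TABLE dbo.Fact_Live_Projection_Weekly ADD Projected_EggProduction DECIMAL(18, 4) NULL;
ALTER TABLE dbo.Fact_Live_Projection_Weekly ADD Projected_FeedConsumption DECIMAL(18, 4) NULL;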
Configuration/Settings → Schedule Tasks
Now that we are configured on the back end, we need to create a scheduled task. We'll add and save one with the type "DW - Live Projection Weekly".
Manual Execution
To manually execute a batch, we'll select our schedule record. Power users of the software have the option to "Run This Job". Clicking this option will immediately send a request to the CONNECT web service to run the process as if it were scheduled for that moment. If a record's Frequency is set to manual, this is the only way it can be run.
The process does not report the results of the jobs; this is something we are committed to adding down the road. We will have to verify that the batches completed by reviewing the warehouse database (see the Managing the Data Warehouse section of the documentation).
If several batches are running, or processing takes a while, we may get an error on this screen that the request took too long to post back. This is a known issue with this form that we have logged and scheduled to resolve. Closing and restarting the application will not prevent the batches from completing their execution at the web service.
Scheduled Execution
If we have set up our scheduled task to run at a certain time, it will be executed at that moment by the scheduler service. Setting the "Frequency" of our schedule record will make it available to the service.
Process
Report Settings
The Projection report will run with the following wizard settings:
Start Date: The Start Date for the Process_Batch record
End Date: 10 years from that date.
Process Interval: Weekly.
Flocks: All active and pending flocks as of the run date.
Exclude Actual Data: True
This will allow us to calculate projections from the past.
Note: While this was implemented in CONNECT, it still runs the LIVE 1.47 version of the report calculations. We will have to update the installed DLLs to version 1.48 to get the 1.48 results.
When the process starts, it will load all weeks that need to be processed, and all active and pending flocks.
It will verify that the fact table has the required columns General_GroupID and General_ReportStartDate. It will also compare column data types to the Aeros LIVE report fields. If any of these validations fail, the process will quit for all batches and the errors will be logged in the CONNECT log file.
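If we want to check our own schema ahead of a run, a query like this (standard SQL Server metadata, not part of the product) lists the fact table's columns and types:

-- Inspect the fact table schema before running a batch.
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Fact_Live_Projection_Weekly'
ORDER BY ORDINAL_POSITION;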
For each week:
Connect to LIVE
Execute the Report Calculations
Scrub the results:
There are dates that will crash the database if we attempt to insert them (01-01-0001, as an example). We'll scrub these and set them to null.
If a decimal value can't be parsed to a valid decimal result, set it to zero.
Remove any calculated columns that are not explicitly added as columns to the warehouse.
Delete older records from the same period (call spDeleteFactLiveProjectionWeekly)
Bulk insert the data
Save the Process_Batch record and any messages and warnings that were logged.
Managing the Data Warehouse
At present, there is no user interface for managing the data warehouse. All management must be done through SQL. Here are some common management tasks we'll need to perform against the warehouse database:
Setting the Start Date
The Process table has a value called Process_Start. Every time the software runs, if any week-ending date since this first date is missing a batch, it will be sent to CONNECT to be processed.
When setting up a new data warehouse, we may want to go back and archive a few years in the past. Our first instinct would be to set this date to a week-ending date back then and kick off the process. This may take a very long time to complete! If we want to archive years past, it is recommended to instead set this date a month or so back, run it for four weeks, then set it another month or two back and run it again. This will allow us to calculate our history without overwhelming the service, and to keep up with the results in more manageable workloads.
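As a sketch, stepping the start date back one month at a time might look like this; the Process_Key value comes from the configuration above, and the one-month step is just an example interval:

-- Step the start date back one month, then let the schedule run before stepping back again.
UPDATE dbo.Process
SET Process_Start = DATEADD(MONTH, -1, Process_Start)
WHERE Process_Key = 5; -- the Live Projection Weekly process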
We may want to calculate our history, but a limitation of the system in its present state is that only flocks that are currently active or pending in the software are returned for calculation. We'll have to restore older flocks to calculate their historical data.
Rerunning a Period
Suppose we run the service and calculate a week, proof the work, and determine that the data was incorrect or incomplete; the issue has since been resolved in LIVE, and we wish to update our warehouse. We need only delete the Process_Batch record for that week. The next time the process runs, whether manually or on a schedule, it will identify this batch record as missing and queue that week to recalculate.
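A sketch of that delete; the Process_Key foreign key and week-ending date column names used here (Process_Key, Batch_WeekEnding) are assumptions, so check them against the actual Process_Batch schema:

-- Hypothetical column names; verify against the real Process_Batch table.
DELETE FROM dbo.Process_Batch
WHERE Process_Key = 5
  AND Batch_WeekEnding = '2024-06-01'; -- the week to recalculate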
When we delete a batch record, it does not delete the flock records that were added to the fact table. These remain until the schedule runs and calls the stored procedure "spDeleteFactLiveProjectionWeekly" before inserting the updated versions. This way, if we need to recalculate but want to continue proofing the results for other flocks, or report against figures that have not been finalized, we can continue to do so.
Checking for Warnings
If a batch completed but there are warnings that need to be reviewed by the end user, these will be saved to the Process_Message table, with a foreign key to the batch they were reported for. Common warnings we may see in this table are (a sample query follows the list):
Errors reported by the Aeros LIVE calculations (the ones we can review when clicking the "Errors" menu item on a LIVE report).
Warnings about data that was scrubbed to avoid a crash of the bulk insert process.
Warnings about columns that were added to the data warehouse that do not have a corresponding calculation in Aeros LIVE.
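To review them, a query along these lines should work; the key and message column names (Batch_Key, Message_Text) are assumptions to be checked against the actual schema:

-- Hypothetical column names; verify against the real Process_Batch/Process_Message tables.
SELECT b.Batch_Key, m.Message_Text
FROM dbo.Process_Message AS m
INNER JOIN dbo.Process_Batch AS b ON b.Batch_Key = m.Batch_Key
ORDER BY b.Batch_Key;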
Proofing the Work
Imagine that we run four separate projection reports for all of our flocks, for the life of each flock. Each week, we run the report with a week-ending date one week later than the previous. If we lay those reports out side by side, we will see one week less of projected data for each successive week. Over time, this will provide us with snapshots of every projected period until the last week of the life of the flock.
This is how the projection warehouse differs from the house data warehouse. While the house warehouse has only one period record for each week in the life of the flock, the projection warehouse has one for every projected week, for every period we run it. We can left-join these records by General_GroupID and either period start date or flock age to line them up and compare our values for every week that we performed the projection report. A sketch of such a comparison follows.
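This sketch self-joins two runs of the fact table on General_GroupID and a period start date; the Period_StartDate and Projected_EggProduction column names are illustrative assumptions, as is the example run date:

-- Compare the same projected period across two consecutive report runs.
-- Period_StartDate and Projected_EggProduction are hypothetical column names.
SELECT earlier.General_GroupID,
       earlier.Period_StartDate,
       earlier.Projected_EggProduction AS FirstRunValue,
       later.Projected_EggProduction   AS SecondRunValue
FROM dbo.Fact_Live_Projection_Weekly AS earlier
LEFT JOIN dbo.Fact_Live_Projection_Weekly AS later
    ON later.General_GroupID = earlier.General_GroupID
   AND later.Period_StartDate = earlier.Period_StartDate
   AND later.General_ReportStartDate = DATEADD(WEEK, 1, earlier.General_ReportStartDate)
WHERE earlier.General_ReportStartDate = '2024-06-01'; -- example run date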