Recap of Reporting and Analytics Environment v1.0
As the Workday implementation and go-live activities continue, the Data and Analytics team has been deploying v1.0 of the Reporting and Analytics Environment (RAE). This new system has been running in beta since our soft launch in August 2019. With data now available from Workday, we are excited to share some information and stats about the new analytics environment.
We look forward to working with data and business analysts across the organization, creating new and exciting reports and dashboards, and bringing insights to Ohio State management and leadership.
The RAE is a data lake-style analytics system built on Amazon Web Services (AWS) cloud technologies. By using AWS, we are able to take advantage of cloud design patterns, including serverless processing, pipeline and systems automation, an on-demand cost model (per-minute processing), and flexible technical infrastructure. Each of these patterns will enable us to grow and flex the RAE into a data analytics powerhouse for the university.
The immediate focus of the team is to ensure that all processes are running optimally and to react to data needs and issues raised during Workday go-live and the weeks immediately following. As we move into 2021, the team will continue to add more Workday datasets and also begin working with other teams and units that have expressed interest in storing their data in the RAE. Our goal is to have the RAE serve as a cross-functional data store, where analysts and data teams across the university can explore, design, and build innovative insights. We are well on our way to achieving this goal!
If you have any questions about the RAE, please do not hesitate to reach out to me or the Data and Analytics Team. You can visit the data.osu.edu website for more information.
Feel free to share this information with colleagues and others who may have an interest in the RAE.
v1.0 RAE Information and Statistics
- The RAE is currently loaded nightly from six enterprise systems:
  - Workday (FIN, HCM, and SCM)
  - SIMS
  - Salesforce (supporting prospect and recruitment reporting)
  - IDM
  - ServiceNow
  - PeopleSoft Campus Solutions (supporting a limited set of Enterprise Project R1 reporting needs)
- As PeopleSoft and other systems are retired, the RAE will also store data from those applications to facilitate historical reporting:
  - Historical data loads start this weekend and will complete in the April 2021 timeframe
  - Core systems data being archived in the RAE:
    - PS Finance, HR, and eMaterials
    - PI Portal
    - OHR Datamarts
    - BuckeyeOasis Systems (eTravel, eRequest, eLeave, etc.)
    - OCIO Data Warehouse and Marts (FinDW, GL Analytics, Employee Analytics, etc.)
    - Student Life FDS
    - Multiple local datasets and files from MS Excel and MS Access
- Information & metadata about RAE datasets are captured and ready for annotation in the new Ohio State Data Catalog: Collibra
  - More information about Collibra will be broadcast to the OSU community during the week of January 18th! Stay tuned!
- Key stats:
  - 92 datasets are loaded from Workday nightly
  - 129 total dataset loads are processed each night
  - 44 higher order datasets are rebuilt nightly to support enterprise reports and other analytics
  - 323+ million total rows of data have been loaded thus far
  - 100+ GB of data ingested into the system
    - NOTE: This is prior to the historical loads which will occur this weekend
  - Average nightly load time = 5 hours, 13 minutes
    - Most loads begin at midnight
    - This is roughly on par with the legacy nightly batch processing timing
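
For readers who want to put the key stats in context, here is a small Python sketch of the arithmetic behind them. It rests on two assumptions that the figures above do not state outright: that the 92 Workday datasets are counted within the 129 nightly loads, and that a typical load starts exactly at midnight and runs for the average duration. The variable names are illustrative only and are not part of the RAE.

```python
from datetime import datetime, timedelta

# Figures quoted in the key stats above.
WORKDAY_DATASETS = 92
TOTAL_NIGHTLY_LOADS = 129
AVERAGE_LOAD_TIME = timedelta(hours=5, minutes=13)

# Assumption: the 92 Workday datasets are counted within the 129 nightly loads.
other_loads = TOTAL_NIGHTLY_LOADS - WORKDAY_DATASETS
workday_share = WORKDAY_DATASETS / TOTAL_NIGHTLY_LOADS

# Assumption: the load starts exactly at midnight and takes the average time.
# The calendar date is arbitrary; only the time of day matters here.
start = datetime(2021, 1, 1, 0, 0)
finish = start + AVERAGE_LOAD_TIME

print(f"Non-Workday loads per night: {other_loads}")          # 37
print(f"Workday share of nightly loads: {workday_share:.1%}")  # 71.3%
print(f"Typical finish time: {finish:%H:%M}")                  # 05:13
```

Under those assumptions, roughly seven in ten nightly loads come from Workday, and an average night's processing would wrap up shortly after 5 a.m.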