Data Lake (RAE)

Ohio State’s Reporting and Analytics Environment (RAE) is a centralized data lake style system that provides a place to capture, organize, enhance, and query core information about the university’s business processes.

The RAE holds data needed for both extended and detailed analytics. The data lake also provides capabilities for university areas to store and share data as well as allowing data analysts to bring additional datasets into the system for deep, cross-functional analytics. 

 

New Amazon Redshift ODBC Driver version 2.x

May 2026 Recommended Upgrade Steps

Completing the upgrade in the order listed below reduces risk, keeps your configuration intact, and ensures continued access to RAE. Be sure to install version 2.x before removing version 1.x.

  1. Download the Driver
    Download the latest 2.x driver from the Ohio State Software Center or directly from Amazon.
  2. Install Version 2.x
    Install the new driver without removing version 1.x.
  3. Rename Your Existing Connection
    In the ODBC Data Source Administrator, rename your current connection:
    • From: OSU_RAE_PRD
    • To: OSU_RAE_PRD_old
  4. Create a New Connection
    Create a new ODBC connection named OSU_RAE_PRD.
    Select the 2.x driver and copy over the settings from the previous connection.
  5. Test the Connection
    Confirm that the new connection works by testing it in Microsoft Access or your reporting tool.
  6. Remove Version 1.x (After Successful Testing)
    Once everything is working correctly:
    • Delete the old ODBC connection
    • Uninstall the 1.x driver

Please complete this upgrade by May 1, 2026, to ensure uninterrupted access to RAE.
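
For users who also reach the RAE from scripts, the connection test in step 5 can be repeated outside of Access. The following is a minimal sketch, not an official procedure. It assumes the third-party pyodbc package is installed and that the new connection is named OSU_RAE_PRD as in step 4; the function names and the way credentials are passed are hypothetical.

```python
def build_connection_string(dsn: str, user: str, password: str) -> str:
    """Build an ODBC connection string that refers to a named DSN.

    Values containing semicolons are not handled in this sketch.
    """
    parts = {"DSN": dsn, "UID": user, "PWD": password}
    return ";".join(f"{key}={value}" for key, value in parts.items())


def check_dsn(dsn: str, user: str, password: str) -> bool:
    """Return True if a trivial query succeeds through the DSN."""
    # pyodbc is a third-party package (assumption: it is installed).
    # It is imported here so the helper above works without it.
    import pyodbc

    connection = pyodbc.connect(
        build_connection_string(dsn, user, password), timeout=10
    )
    try:
        row = connection.cursor().execute("SELECT 1").fetchone()
        return row[0] == 1
    finally:
        connection.close()
```

Calling check_dsn("OSU_RAE_PRD", user, password) while connected to the VPN should return True if the 2.x driver and the new connection are set up correctly. If it raises an error, compare the settings of OSU_RAE_PRD against OSU_RAE_PRD_old before removing version 1.x.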

 

What data is in the RAE?

Use the public RAE Object Domain Directory dashboard to learn more about what data is in the RAE. With the dashboard you can see the different domains (HR, Finance, etc.), specific tables, and permissions needed to access the data in the RAE.

In June 2025, student data was transferred from the Operational Data Store (ODS)/DWHCRPT to the Reporting and Analytics Environment (RAE)/AWS Redshift. Now that this transition is complete, the ODS has been retired, and all student reporting data is now stored in the RAE.

Request Data Lake (RAE) Access

Learn how to request Reporting Access or Direct Access.

Connect to the RAE

Learn how to configure a client to connect after receiving credentials.

Components of the RAE

Data Storage

Data lakes are often used to consolidate all of an organization’s data in a single, central location, where it can be saved “as is,” without the need to impose a schema (i.e., a formal structure for how the data is organized). This helps us avoid lock-in to a proprietary system such as a data warehouse, and avoiding that lock-in has become increasingly important in modern data architectures. Data lakes are also highly durable and low cost because they scale easily and use object storage.

Data Ingest

A number of tools and capabilities come into play when loading data into a data lake. We’ve designed the system to accept data from other databases, web APIs, and file inputs. Currently, we support both batch (infrequent) and micro-batch (near real time) data feeds from sources across Ohio State, including Workday, SIMS, Salesforce, and PeopleSoft SIS. In the near future, we will also offer support for streaming datasets (real time), as well as unstructured data inputs.
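
To illustrate the difference between a batch feed and a micro-batch feed, the sketch below groups incoming records into small, fixed-size batches. It is a generic illustration only and does not describe how the RAE's ingest jobs are actually built; the function name and the sample records are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# A batch feed loads everything in one pass; a micro-batch feed loads
# small groups of records as they become available.
events = [f"event-{n}" for n in range(5)]
full_batch = list(events)
small_batches = list(micro_batches(events, 2))
```

In practice a micro-batch feed is usually triggered on a short time interval rather than by a fixed record count; the fixed size here simply keeps the example small.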

Optimized Data

While many analysts are comfortable working with “as is” data, it is best practice to apply what has been learned about the data so that other analysts’ work becomes much easier. Thus, we work with data analysts and engineers to apply well-designed and vetted transformations that progressively make the data more usable and accessible for analytics needs. As data goes through each transformation, we categorize the resulting “new datasets” using the Medallion Data Architecture nomenclature: Bronze, Silver, and Gold.
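
The Bronze, Silver, and Gold progression can be pictured with a toy example. The sketch below is an assumption-laden illustration, not the RAE's actual transformation logic: the records, field names, and cleaning rules are all hypothetical.

```python
from collections import defaultdict

# Bronze: records stored "as is", here as untyped strings with a duplicate row.
bronze = [
    {"dept": " HR ", "amount": "100.50"},
    {"dept": "Finance", "amount": "20"},
    {"dept": "Finance", "amount": "20"},
    {"dept": "hr", "amount": "9.50"},
]


def to_silver(rows):
    """Clean and type the raw rows, dropping exact duplicates."""
    seen = set()
    silver_rows = []
    for row in rows:
        cleaned = (row["dept"].strip().upper(), float(row["amount"]))
        if cleaned not in seen:
            seen.add(cleaned)
            silver_rows.append({"dept": cleaned[0], "amount": cleaned[1]})
    return silver_rows


def to_gold(rows):
    """Aggregate cleaned rows into a business-level total per department."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["dept"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Here the Silver step standardizes values and removes the repeated Finance row, and the Gold step produces a summary that a dashboard could use directly.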

Data Content

The Enterprise Data Lake team is constantly working with data partners around campus to acquire, store, and make datasets available to analysts. The RAE’s primary focus is to capture data about Ohio State’s key business processes and systems. We do not store information or data related to OSU’s research mission, nor data from any OSU Medical Center systems (e.g., EPIC). To learn more about the data we have currently loaded into the RAE, you can view the RAE Object Domain Directory Tableau dashboard. With the dashboard you can see the different domains (HR, Finance, etc.), specific tables, and permissions needed to access the data in the RAE.

In June 2025, student data was transferred from the Operational Data Store (ODS)/DWHCRPT to the Reporting and Analytics Environment (RAE)/AWS Redshift. Now that this transition is complete, the ODS has been retired, and all student reporting data is now stored in the RAE. View more technical detail related to the tables used for student data in the RAE.

Target Audience

The RAE forms the hub for storing the datasets that analysts use daily to derive insights about Ohio State’s business operations and systems. The core user of the data lake will be someone who is skilled in:

  • Relational databases
  • SQL query writing
  • Working with “as is” or “raw” data, including building your own transformation and join logic
  • Building insights from the ground up using SQL queries and/or tools like Tableau, Python, R, and SAS

As previously mentioned, as data gets progressively cleansed and more aligned to a “final business view,” analysts who are less skilled in the above capabilities may reference the RAE only to use pre-built Silver or Gold layer data objects, sourcing them into tools like Tableau for dashboard building and reporting.

Technology Set

Today, the RAE data lake is powered by components within OSU Amazon Web Services that are combined into a customized system implementation. The core AWS components utilized by the RAE are: 

  • Redshift
  • Glue
  • Lambda
  • S3 

You can read more about AWS and its analytics tools and capabilities.

 

Security

To maintain security of the data within the RAE, the database is behind a specific VPN-connection profile. Users that have been granted access to the RAE will also have been placed into the appropriate VPN configuration. To connect to the RAE, follow the steps on the Connecting to the RAE guide.
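
Because the database is only reachable through the VPN, a quick network check can help distinguish a VPN problem from a credential or driver problem. The sketch below is a generic TCP reachability test; the function name is hypothetical, and the host and port to pass in are assumptions that should come from the Connecting to the RAE guide.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, can_reach("rae.example.edu", 5439) uses a placeholder host name and the Amazon Redshift default port of 5439; the RAE's actual endpoint and port may differ. A result of False while the VPN is disconnected and True once it is connected suggests the VPN profile is working as intended.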

Need Help?

If you are having technical issues with connecting to or querying the RAE, please contact the IT Service Desk.

FAQs

What is the RAE?

The Reporting and Analytics Environment (RAE) is both a place to source and store data. It will provide an environment for data analysts interested in creating reports and performing analytics with cross-functional datasets. The RAE will house historical data, as well as data from other systems that will not be converted to Workday.

How is the RAE related to the implementation of Workday?

Workday’s robust reporting environment helps leverage Ohio State data quickly and easily for operational decisions at all leadership levels. 

Much of the data that now resides in local data marts distributed across campus will be brought together in Workday and the Reporting and Analytics Environment. Teams in different units who rely on these data for their units’ processes will access them from Workday and the Reporting and Analytics Environment. Historical data from local data marts can be included in RAE.

What tools can I use to access data in the RAE?

Data in the RAE can be accessed using Tableau Web (using Enterprise Tableau Data Sources), Tableau Desktop or any SQL-capable desktop tool (such as DBeaver).

How can I connect directly to the Redshift tables available to me?

If you have been granted access to any Redshift tables in the RAE, you can access them with a SQL navigation program. DBeaver is the recommended tool for accessing the RAE; however, you can still use your current software. You will need to update the settings in your SQL navigation program to be able to connect. Instructions on how to install DBeaver and what settings to use are located in the OTDI Knowledge Base.

I need a contact person for additional help with Workday reporting and historical data.

Contact information for reporting leads and business area representatives is available in the Administrative Resource Center to help with your data and Workday reporting needs. Log in to the ARC with your Ohio State credentials.

If you still have questions about getting historical data or reporting help after consulting the information in the previous link, please contact the Service Desk by calling 614-688-4357 (HELP) or emailing servicedesk@osu.edu and ask that your question be routed to the “Administrative Services Data Warehouse” group.

Who should I contact for RAE technical support?

If you are having technical issues with connecting to or querying the RAE, please contact the IT Service Desk.

I am having issues with MS Access. Are there tips for converting my MS Access to use the RAE?

ODBC Connection Name: In coordination with the EARI team, we are now recommending that MS Access users name their RAE ODBC connections as OSU RAE PRD. Using this standard connection name will facilitate sharing of Access databases across different units. EARI is in the process of redesigning their job aid databases for the RAE and these databases will expect your ODBC connection to the RAE to be named OSU RAE PRD.

MS Access Question – Primary Keys: There have been several questions regarding primary keys in the RAE. The backend database system used for the RAE (AWS Redshift) does not enforce primary keys. We have added a table to the RAE named “data_student_brz.dwhcrpt_primary_keys” with a complete list of the primary keys for each table, which can be used as a reference. Note: When manually linking RAE tables in Access, make sure NOT to set any primary keys when prompted. See the next item for more details.

MS Access – “#Deleted” issue: We have received several reports of MS Access databases where all the values in a query result or table show up as “#Deleted” when connecting to a RAE linked table.  Thanks to a tip from one of our users, we have found that the issue only occurs if the primary keys are manually set when linking a table. If you are manually linking to tables in the RAE, make sure that you do not choose any primary keys for the table if prompted by Access. We have confirmed that this is a known issue with certain types of fields in MS Access.
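
Since Amazon Redshift does not enforce primary key uniqueness, analysts who rely on the key columns listed in the reference table may want to confirm that those columns really identify rows uniquely. The sketch below builds a standard duplicate-check query. The function name, the demo table, and the column names are hypothetical, and an in-memory SQLite database stands in for the RAE so the example can run anywhere.

```python
import sqlite3
from typing import List


def duplicate_key_query(table: str, key_columns: List[str]) -> str:
    """Build a query that returns key values appearing more than once.

    Identifiers are inserted directly into the SQL text, so only pass
    trusted table and column names.
    """
    if not key_columns:
        raise ValueError("at least one key column is required")
    columns = ", ".join(key_columns)
    return (
        f"SELECT {columns}, COUNT(*) AS row_count "
        f"FROM {table} GROUP BY {columns} HAVING COUNT(*) > 1"
    )


# Demonstration against an in-memory SQLite table standing in for a RAE table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE appl_data (emplid TEXT, appl_nbr TEXT)")
demo.executemany(
    "INSERT INTO appl_data VALUES (?, ?)",
    [("1", "A"), ("1", "B"), ("2", "A"), ("2", "A")],
)
duplicates = demo.execute(
    duplicate_key_query("appl_data", ["emplid", "appl_nbr"])
).fetchall()
```

An empty result means the chosen columns are unique in the table. The same SELECT text can be pasted into a pass-through query or a SQL tool connected to the RAE, using the real table and key column names.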

Is there an update to MS Access that we can use with the transition to RAE?

There is a new editor called Monaco SQL Editor for MS Access. It is a major quality-of-life improvement for editing SQL within MS Access, but it is completely optional at this time. Most university PCs are on a monthly update cycle for Office365 and have not yet received the updated version of MS Access. We expect it will be generally available in a few months. If you would like to try the new Monaco editor before then, you can see if your desktop support team can assist you in upgrading early by switching to the “Office365 Current Channel.”

Is there any new student data in the RAE?

In addition to the new curated cumulative datasets, you will also have access to daily historical change tables for most ODS tables and custom objects.

What will happen to the data in the data mart (DWDMOSU)?

There were two basic categories of data in the data mart (DWDMOSU):

  1. Snapshot tables (tables that start with SS_)
  2. Cumulative datasets (just OUR_OSR_CUMCENSUS and OUR_OSU_CUM_CPPS)

The tables that started with "SS" in the ODS are in the data_student_ss schema in the RAE. 

Permissions for these tables in the RAE follow the same domain as the source table, so data_student_ss.SS_PS_OAD_APPL_DATA is in the same domain as data_student_brz.PS_OAD_APPL_DATA.

Two other tables that were in DWDMOSU are now in data_student_gld: OUR_OSR_CUMCENSUS and OUR_OSU_CUM_CPPS. Both are part of the non-sensitive enrollment, records, and curriculum domain.
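
When updating older queries, the naming rules described in this answer can be written down as a small lookup. The sketch below is only an illustration of those rules; the function names are hypothetical, and it assumes table names are written in upper case as shown in the examples.

```python
CUMULATIVE_TABLES = {"OUR_OSR_CUMCENSUS", "OUR_OSU_CUM_CPPS"}


def rae_location(datamart_table: str) -> str:
    """Map a former DWDMOSU table name to its schema-qualified RAE name."""
    name = datamart_table.upper()
    if name in CUMULATIVE_TABLES:
        return f"data_student_gld.{name}"
    if name.startswith("SS_"):
        return f"data_student_ss.{name}"
    raise ValueError(f"{datamart_table!r} is not a known DWDMOSU table pattern")


def permission_source_table(snapshot_table: str) -> str:
    """Return the bronze table whose domain governs access to a snapshot table."""
    name = snapshot_table.upper()
    if not name.startswith("SS_"):
        raise ValueError("snapshot tables start with SS_")
    return f"data_student_brz.{name[len('SS_'):]}"
```

For example, rae_location("SS_PS_OAD_APPL_DATA") gives data_student_ss.SS_PS_OAD_APPL_DATA, and permission_source_table("SS_PS_OAD_APPL_DATA") gives data_student_brz.PS_OAD_APPL_DATA, the table whose domain determines access.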