The Boston College Enterprise Data Warehouse (EDW) supports reporting and analytics over a broad spectrum of University data. Developed in 2003, the EDW has advanced in terms of content, information delivery tools and data integration capabilities to provide consistent and reliable information to University decision makers. Using a Kimball-based approach, the EDW consists of fact and dimension tables structured for efficient reporting and authorized management.
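To make the fact-and-dimension idea concrete, here is a minimal sketch of a Kimball-style star schema, using Python's built-in sqlite3 module. The table names, column names, and sample rows (dim_school, fact_enrollment, and so on) are hypothetical and chosen only for illustration; they do not describe the EDW's actual design or platform.

```python
import sqlite3

# Hypothetical star schema: one dimension table and one fact table.
# All names and rows are illustrative assumptions, not the EDW's real model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_school (
        school_key  INTEGER PRIMARY KEY,   -- surrogate key
        school_name TEXT NOT NULL          -- descriptive attribute
    );
    CREATE TABLE fact_enrollment (
        school_key INTEGER REFERENCES dim_school(school_key),
        term       TEXT NOT NULL,
        credits    INTEGER NOT NULL        -- additive measure
    );
    INSERT INTO dim_school VALUES (1, 'School A'), (2, 'School B');
    INSERT INTO fact_enrollment VALUES
        (1, '2024F', 3), (1, '2024F', 4), (2, '2024F', 3);
""")


def credits_by_school(connection, term):
    """Sum the credits measure per school for one term."""
    rows = connection.execute(
        """
        SELECT d.school_name, SUM(f.credits)
        FROM fact_enrollment AS f
        JOIN dim_school AS d ON d.school_key = f.school_key
        WHERE f.term = ?
        GROUP BY d.school_name
        ORDER BY d.school_name
        """,
        (term,),
    ).fetchall()
    return dict(rows)


totals = credits_by_school(conn, "2024F")
```

In this arrangement, reports join narrow fact rows to descriptive dimension rows, which is what keeps reporting queries short and predictable.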

Institutional Research & Planning (IR&P) and Information Technology Services (ITS) work jointly on content development. New subject areas are carefully vetted and structured for ease of use and reliable information. IR&P ensures that good metadata is in place and that the data has been reconciled. The EDW is the University’s largest database and maintains appropriate tooling for ETL, reporting, and administration.

The EDW serves three key functions as a data repository:

  • Historical Records - Structures data in point-in-time snapshots. Most EDW data is loaded on a nightly basis, and reporting is possible for any particular date. The EDW serves as an archive for core university data.

  • Integration - Allows users to combine disparate datasets. Boston College’s application topology is very decentralized. The EDW serves as a platform to derive information across joined subject domains.

  • Reporting - The tools support both analytical and operational needs. ETL (Extract, Transform, and Load) processes are carefully regulated to give users confidence that the data is accurate each morning.
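The Historical Records function described above can be sketched as an "as of" lookup over nightly snapshots. This is a simplified illustration under stated assumptions: the SnapshotStore class, its method names, and the sample student records are hypothetical, and the real EDW stores its snapshots in database tables rather than in memory.

```python
from bisect import bisect_right, insort
from datetime import date


class SnapshotStore:
    """Keeps one snapshot per load date and answers 'as of' queries."""

    def __init__(self):
        self._dates = []   # load dates, kept sorted
        self._rows = {}    # load date -> snapshot contents

    def load(self, load_date, rows):
        """Record the snapshot produced by one nightly load."""
        if load_date not in self._rows:
            insort(self._dates, load_date)
        self._rows[load_date] = dict(rows)

    def as_of(self, report_date):
        """Return the latest snapshot loaded on or before report_date."""
        idx = bisect_right(self._dates, report_date)
        if idx == 0:
            return None    # nothing had been loaded yet
        return self._rows[self._dates[idx - 1]]


store = SnapshotStore()
store.load(date(2024, 1, 1), {"S001": "enrolled"})
store.load(date(2024, 1, 2), {"S001": "enrolled", "S002": "enrolled"})

first_night = store.as_of(date(2024, 1, 1))
later = store.as_of(date(2024, 1, 5))
before_any_load = store.as_of(date(2023, 12, 31))
```

Because each load is kept rather than overwritten, a report for a past date reads the snapshot that was current on that date instead of today's values.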

 

New subject domains have been added to the EDW over time. Student data has been core to the EDW since the program’s inception. Current data domains are represented in the following diagram:

Data Warehouse content domains

The "Students" subject area contains student demographic information, course offerings, course enrollments, student accounts, financial aid, student graduation, and grades.

New major domains often begin as data marts within the EDW environment. This holds true for Finance and Library. These areas begin as standalone reporting structures serving specific University audiences and are designed to be integrated into the EDW once they reach a level of maturity.

Primary users of the EDW also manage and administer it. Each subject domain is assigned a data steward, whose responsibilities include authorizing access to the domain for all EDW users and serving as a subject matter expert. Operational roles also include responding to issues, planning new features for the EDW, communicating with users, and assisting with upgrades.
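As a rough illustration of steward-authorized access, the sketch below records a grant only when it is approved by the steward assigned to that domain. The DomainAccess class, the user names, and the steward names are hypothetical; the source does not describe how the EDW actually implements its access controls.

```python
class DomainAccess:
    """Tracks which users may read which subject domains."""

    def __init__(self, stewards):
        self._stewards = dict(stewards)   # domain -> assigned data steward
        self._grants = set()              # (domain, user) pairs

    def grant(self, domain, user, approved_by):
        """Grant access only if the domain's own steward approved it."""
        if self._stewards.get(domain) != approved_by:
            raise PermissionError(
                f"{approved_by} is not the data steward for {domain}"
            )
        self._grants.add((domain, user))

    def can_read(self, domain, user):
        return (domain, user) in self._grants


# Hypothetical stewards and users, for illustration only.
access = DomainAccess({
    "Students": "steward_students",
    "Finance": "steward_finance",
})
access.grant("Students", "alice", approved_by="steward_students")

try:
    access.grant("Finance", "bob", approved_by="steward_students")
    rejected = False
except PermissionError:
    rejected = True
```

Keying approval to the domain rather than to a central administrator mirrors the idea that each subject area has its own accountable steward.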

The following diagram shows the primary EDW partners. Roles and responsibilities for each are defined below.

EDW Stakeholders and Responsibilities

Primary Data Warehouse partners

 

Group | Roles & Responsibilities
Institutional Research & Planning (IR&P) | Major reporting deliverables for the University, including staff dashboards, department data, course evaluation trend reporting, and surveys. IR&P administers quality checks and metadata definitions for all new subject areas. IR&P works with ITS on initiating information governance and data literacy efforts for the University.
Student Services | Operational and analytical reporting, information governance, and data stewards for student subject areas.
Undergrad Admissions | Operational and analytical reporting for applicant information, and data steward responsibilities.
Graduate Admissions | Each of Boston College’s eight schools has a data steward assigned for student and applicant reporting for their respective school. Reporting is available primarily for applicant and student information.
Enrollment Management | Developing successful admission and retention models drawn from applicant and student data is strategic for the University. The EDW supports the analytic requirements of this team.
Information Technology Services | ITS is responsible for the technical architecture and implementation of the EDW. ITS also helps develop information governance tools and processes used for the EDW and responds to EDW audits. ITS mentors and provides training for reporting but has a limited role in delivering business content.
Human Resources | Employee data is supplied by HR systems and used by IR&P and other groups for institutional studies.
Finance | Finance is currently a standalone data mart within the EDW environment; core financial data is available for reporting and analytics. This is a multi-year phased project.
Library | This is also a standalone data mart within the EDW environment, with plans to integrate with the EDW in the future. This is the first higher education data warehouse for the leading Cloud-based vendor supporting university library services.

 

Much work lies ahead on integrating data. The EDW has made information delivery more accessible, but understanding relationships across data domains remains a key goal. How the student experience impacts our university’s finances, how effective library resources translate to student success, and how alumni benefit from their Boston College education are all examples of strategic questions to be answered by the next phase of this program. To complement this work, we will be exploring additional visualization tools and artificial intelligence solutions.

University transactional systems (systems of record) continue to evolve and improve. Managing historical records across major system changes is challenging. The EDW must adapt to these changes and continue to serve as a reference point for learning from our past. The Cloud has introduced a new set of data integration challenges, and iPaaS (integration platform as a service) and other Cloud-centric technologies must become part of our technical portfolio. Governance becomes increasingly complex as data becomes more integrated. Developing an enhanced security model and collaborative data stewardship will be part of this future roadmap.