Using ML to Predict Attendance at the School of the Art Institute of Chicago
As a multi-disciplinary art school ranked #2 in the nation, the School of the Art Institute of Chicago (SAIC) is a competitive school offering enrollment to approximately 3,500 students a year. This comes with a rigorous enrollment process, one that SAIC wanted to enhance even further through data analysis and machine learning (ML).
The admissions team at SAIC then learned of an opportunity to do just that, and in the process help bring opportunities to art students regardless of their socioeconomic background. In 2022, as part of its 50th anniversary celebration, SPR announced it would donate $50,000 of ML services to an organization looking to improve the local community. SAIC applied and was selected.
The SAIC project centered around student application and enrollment, and ways to use ML to better understand which applicants are most likely to enroll.
Working closely with subject matter experts and data experts at SAIC, SPR delivered a machine learning model that the SAIC admissions office used – and can continue to use – to better predict which admitted students will actually enroll at the school. In turn, by better predicting which students will enroll, the school can provide those applicants with access to scholarships that might assist them in the admissions process at SAIC.
“What we believe at SPR is that we're helping line up the students with the need and the students with the artistic talent to actually come to the school,” said Matt Mead, CTO at SPR. “This ML project allows the school to make better use of these dollars and attract the right students, and probably give an opportunity to a student that wouldn't otherwise have an opportunity to go to a world-class college like SAIC.”
SAIC constrained the project to one simple question: If a student is admitted, will they enroll?
Like many organizations, SAIC had been storing data over the years and wanted to extract actionable insights from that data. Fortunately, SAIC was already well along in its data science journey: “It helped us more quickly get to the point where we were able to create value from their data based on the prior work they had done,” Mead said.
This was due to work done by the SAIC team, including Kyle O’Connell, Director of Enrollment Analytics and Forecasting at SAIC. In fact, several years ago, O’Connell had independently started an ML project: “At the end of the project, I learned that the data that would be really helpful to us hadn't been collected, and we were missing some key parts,” O’Connell said. The team made modifications and started collecting additional pieces of information, but then had to wait several years to accumulate enough data to be useful in the model.
That’s when O’Connell heard about the SPR giveaway. He was excited to work with SPR because, while he had years of data experience, he needed to fill in some gaps with ML experience to advance the project.
Sourcing the Data
In a typical ML project, the data team begins by organizing a set of data. SPR onboarded several SAIC data scientists onto AWS SageMaker, then moved data into AWS to begin analyzing, cleaning, and creating models. By onboarding everyone onto one cloud platform, both teams – SAIC and SPR – were able to collaborate seamlessly and share data in a way that would have been difficult had the teams worked independently of each other on their own platforms.
The data sources came from information in a student’s application, including basic demographic information and the applicant’s artistic portfolio. Additionally, the team added other data sources, such as:
- Country of origin
- High school location
- Webinars the student attended
- Other interactions with the student during the admissions and enrollment periods
Engineering the Data
The team then began manipulating the data in an iterative cycle, improving both the data inputs and how the algorithm interprets them, all aimed at producing the most accurate model.
“Those things have to be manipulated a little bit so that the algorithm can start understanding them in numbers,” said Steven Devoe, Director of Data Analytics at SPR. “From there, you start to understand which factors – like high school or locations – start to determine the influencers to either enroll or not enroll. You work on refining those to make your model as accurate as possible.”
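One common way to let an algorithm "understand" a text field in numbers is one-hot encoding, where each category becomes its own 0/1 column. The sketch below is only an illustration of that idea, not the code SPR delivered; the field names (`hs_region`, `webinars_attended`) and the sample records are hypothetical.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 column per observed category."""
    # Collect the distinct categories seen in the data, in a stable order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical applicant records; names and values are illustrative only.
applicants = [
    {"applicant_id": 1, "hs_region": "Midwest", "webinars_attended": 2},
    {"applicant_id": 2, "hs_region": "International", "webinars_attended": 0},
    {"applicant_id": 3, "hs_region": "Midwest", "webinars_attended": 1},
]

encoded_applicants = one_hot_encode(applicants, "hs_region")
for record in encoded_applicants:
    print(record)
```

Numeric fields such as a webinar count pass through unchanged, while the region text is replaced by indicator columns a model can weigh.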
Data engineering is 80% of any data science project, says Kevin Young, Data Engineer at SPR. “What data engineering means is we’re bringing the data in from multiple data sources, often formatted in different ways, and putting that data together into one dataset and in a format that can be fed into a machine learning model.”
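To make that description concrete, here is a minimal sketch of combining two differently shaped sources into one dataset with a single row per applicant. It assumes a hypothetical application table and a hypothetical webinar attendance log keyed by `applicant_id`; the real SAIC sources and schema are not described in this article.

```python
from collections import Counter


def build_dataset(applications, webinar_log):
    """Join application records with a per-applicant count of webinars attended."""
    # The log has one row per attendance event, so collapse it to a count.
    webinar_counts = Counter(entry["applicant_id"] for entry in webinar_log)
    dataset = []
    for application in applications:
        record = dict(application)  # copy so the source rows are not mutated
        record["webinars_attended"] = webinar_counts.get(application["applicant_id"], 0)
        dataset.append(record)
    return dataset


# Hypothetical sources; field names and values are illustrative only.
applications = [
    {"applicant_id": 101, "country": "US"},
    {"applicant_id": 102, "country": "KR"},
]
webinar_log = [
    {"applicant_id": 101, "webinar": "Portfolio Review"},
    {"applicant_id": 101, "webinar": "Financial Aid"},
]

dataset = build_dataset(applications, webinar_log)
for record in dataset:
    print(record)
```

Applicants with no logged interactions still appear in the result with a count of zero, so the model sees every admitted student rather than only the engaged ones.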
Once the team completed this process, they had two choices:
- Leave the model as is and run it less frequently, or
- Put it into a production environment where it makes real-time predictions every second.
Structuring the Project
The SPR and SAIC teams structured this project as a binary classification problem, meaning there are only two paths a student can take: they either enroll at SAIC or they don't. “SAIC has a pretty good enrollment rate, where around 15% of students who are admitted ultimately end up enrolling,” Devoe said. “But in the machine learning world, you really have to consider that because the algorithm will start to skew towards more students not enrolling rather than enrolling, because the dataset isn't 50/50 on the inputs.”
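The skew Devoe describes can be shown with a little arithmetic. The sketch below uses a made-up sample of 100 admitted students at the 15% rate quoted above and computes "balanced" class weights, a common heuristic (total samples divided by the number of classes times the class count). The article does not say which technique SPR actually used to handle the imbalance, so treat this purely as an illustration.

```python
def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = len(labels)
    classes = sorted(set(labels))
    return {
        cls: n_samples / (len(classes) * labels.count(cls))
        for cls in classes
    }


# Hypothetical sample: 15 of 100 admitted students enroll (1), 85 do not (0).
labels = [1] * 15 + [0] * 85

weights = balanced_class_weights(labels)

# A model that always predicts "will not enroll" is right 85% of the time,
# yet it never identifies a single enrolling student.
majority_baseline_accuracy = labels.count(0) / len(labels)

print(weights)
print(majority_baseline_accuracy)
```

With these weights, each enrolling student counts roughly 3.3 times as much as the average example and each non-enrolling student about 0.6 times, which offsets the tendency to predict "not enrolling" for everyone.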
O’Connell said that when the information is put through the machine learning model, the model crushes it back down to identify where things are similar, valuable, different, or the same. “It sifts through that for us in a way that would be difficult to do with more standard tools.”
According to O’Connell, SPR delivered two important things:
- The narrative insights to share with the team, and
- The code to merge various sets of data together and repeat the exercise in different situations.
“I might be able to run the experiment or extract those insights every week, every day if I needed to,” he said. “Without SPR, that wouldn't be possible.”
ML helped narrow down all the data to 10 fields that were impactful. “Even to a trained eye, if you looked at those aside from the probability of scores assigned to them by the model, you would have trouble telling the difference between them,” O’Connell said. “In that way, machine learning makes what's invisible, visible.”
The outside perspective also proved invaluable to O’Connell. SPR was able to point out areas where O’Connell had made assumptions because he has been at SAIC for many years.
The ML project gave the admissions team information that helps them make decisions about student acceptance more easily and quickly. “Everyone in our entire department gets smarter with this data because it relieves all of us to do other things,” said Rose Milkowski, Vice President for Enrollment Management at SAIC.
Milkowski said this project helped her team have more meaningful conversations with prospective students. The team can then streamline and focus their outreach to students who are deemed more likely to attend after receiving an acceptance letter.
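As a rough sketch of how predicted probabilities could focus that outreach, the snippet below ranks admitted students by a model score and keeps the top few. The `enroll_probability` field, the sample scores, and the cutoff are all hypothetical; the article does not describe how SAIC's team actually consumes the model output.

```python
def prioritize_outreach(scored_applicants, top_n):
    """Return the top_n applicants ordered by predicted enrollment probability."""
    ranked = sorted(
        scored_applicants,
        key=lambda applicant: applicant["enroll_probability"],
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical model output; IDs and scores are illustrative only.
scored_applicants = [
    {"applicant_id": 101, "enroll_probability": 0.12},
    {"applicant_id": 102, "enroll_probability": 0.64},
    {"applicant_id": 103, "enroll_probability": 0.31},
    {"applicant_id": 104, "enroll_probability": 0.08},
]

outreach_list = prioritize_outreach(scored_applicants, top_n=2)
for applicant in outreach_list:
    print(applicant["applicant_id"], applicant["enroll_probability"])
```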
Working with SPR
O’Connell said he appreciated SPR’s expertise and collaboration. “SPR were experts in areas that I was not,” he said, such as data engineering and making choices about grading the model’s accuracy. “I appreciated being able to talk to a team that works on this full time, and works on it with a range of different clients,” O’Connell said. “I was able to ask them questions and they took concepts or ideas and could tell me, more broadly, better ways to apply those things.”
Additionally, SPR went above and beyond to make sure the project was done correctly. “During the process, we added some information and they were willing to do it,” Milkowski said, “because they wanted to get the job done right. It wasn't just checking a box and doing this pro bono effort for us. They wanted to get it done right, and they wanted to give us really good information.”