Should the Federal Government Rate Colleges (and Universities)?
President Obama, Secretary Duncan, Deputy Under-Secretary Studley, and others have called “foul” on those of us opposing the proposed ratings system before it exists. Their position is that our response should be constructive and supportive until there is something to criticize.
They don’t understand why many of us, the data experts, are opposed.
It’s not that a ratings system can’t be developed. It can. At issue is the quality and appropriateness of the available data. The existing data are simply inadequate for the proposed task. IPEDS was not designed for this, and organizations like US News & World Report have taken IPEDS as far as it can go, supplementing it with additional data for their rankings. Also at issue is the appropriateness of the federal government providing an overall rating for an institution based on aspects over which it has no authority.
I think it is entirely appropriate for the Department and the Administration to rate colleges on their performance with Title IV financial aid funding and its required reporting. Ratings based on graduation rates, net price, and outcomes of students receiving Title IV aid would be very useful and appropriate. Adding factors for compliance and the accuracy of required reporting under Title IV would give further meaning to a system designed to determine continued eligibility for Title IV programs.
After all, if continued eligibility and amount of available aid under Title IV are the ultimate goals, aren’t these the things to measure?
This is decidedly less politically exciting than saying Institution A has a higher overall rating than Institution B, but it establishes a clear relationship between what is being measured and what matters. It also has the advantage of using existing student-level data on Title IV recipients in the same manner as is being done for Gainful Employment. From where I sit, PIRS is simply Gainful Employment at the institution level rather than the program level.
And that is appropriate.
Using ratings to develop performance expectations for participating in the federal largesse that is Title IV would be a good thing. Regional accreditation and state approval to operate are clearly no longer adequate for gate-keeping, if, indeed, they ever were.
The difficulty is determining what those expectations should be. It is quite reasonable to subdivide institutional sectors in some manner, calculate graduation-rate quartiles or quintiles for the Title IV students within each group, and require institutions in the bottom tier to submit a five-year improvement plan with annual benchmarks. Any institution failing to meet its annual benchmarks two years running could then be eliminated from Title IV. Using multiple measures of success, including wage outcomes from records matched to IRS or SSA data, would reduce any tendency toward lowering standards to survive.
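To make the mechanics concrete, here is a minimal sketch of what that bottom-quartile flagging might look like. The data, column names, and sector groupings are entirely hypothetical; nothing here reflects an actual Department proposal.

```python
import pandas as pd

# Hypothetical data: one row per institution, with its sector and the
# graduation rate of its Title IV (federal aid) cohort.
institutions = pd.DataFrame({
    "institution": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "sector": ["public 4-yr", "public 4-yr", "public 4-yr", "public 4-yr",
               "public 2-yr", "public 2-yr", "public 2-yr", "public 2-yr"],
    "title_iv_grad_rate": [0.78, 0.55, 0.62, 0.41, 0.33, 0.25, 0.29, 0.18],
})

# Within each sector, assign quartiles (0 = bottom) on the Title IV
# graduation rate, then flag the bottom quartile for an improvement plan.
institutions["quartile"] = (
    institutions.groupby("sector")["title_iv_grad_rate"]
    .transform(lambda rates: pd.qcut(rates, 4, labels=False))
)
institutions["needs_improvement_plan"] = institutions["quartile"] == 0

print(institutions)
```

The point of grouping by sector first is the one made above: the comparison only makes sense within reasonably similar groups of institutions, and the rating attaches to Title IV students specifically.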
In an ideal world, with “complete” data, a ratings system would be focused on intra-institutional improvement. In fact, this is the language that Secretary Duncan is beginning to use, as he did in a recent interview:
“Are you increasing your six-year graduation rate, or are you not?” he said. “Are you taking more Pell Grant recipients [than you used to] or are you not?” Both of those metrics, if they were to end up as part of the rating system, would hold institutions responsible for improving their performance, not for meeting some minimum standard that would require the government to compare institutions that admit very different types of students.
The problem is that simple year-to-year improvement measures tend not to be very simple to implement. We have substantial experience with this in Virginia, especially this week, as we work through institutional review of performance measures in preparation for next week’s meeting of the State Council. On any measure, annual variance should be expected. This is especially true for measures with multi-year evaluation horizons, and even more so when institutions are taking action to improve performance, since such actions sometimes fail.
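A quick, purely illustrative simulation shows why: even if an institution’s underlying graduation rate is flat, ordinary cohort-to-cohort noise will make it “improve” in some years and “decline” in others. All numbers below are invented for the example.

```python
import random

random.seed(1)

# Assume a stable true graduation rate and a modest cohort size;
# both values are invented for illustration.
true_rate = 0.55
cohort_size = 400

# Simulate ten annual cohorts, then score each year as "improved" or
# "declined" versus the prior year, as a simple year-over-year metric would.
observed = []
for year in range(10):
    graduates = sum(random.random() < true_rate for _ in range(cohort_size))
    observed.append(graduates / cohort_size)

for year in range(1, 10):
    direction = "improved" if observed[year] > observed[year - 1] else "declined"
    print(f"Year {year}: {observed[year]:.1%} ({direction} vs. prior year)")
```

Nothing about the institution changed from year to year in this sketch, yet a naive improvement measure would reward it in some years and penalize it in others.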
A better approach is to focus on student subgroups within an institution. For example, is there any reason to accept that Pell-eligible students should have a lower graduation rate than students from families with incomes greater than $150,000? We generally understand why that is currently the case, but there is no reason to accept that it must be so. I would argue, vociferously, that if the Department’s goal is to improve access and success, this is where the focus belongs. Rate colleges on their efforts and success at ensuring that all students admitted to the same institution have the same, or very similar, opportunity for success. Provide additional levers to increase the access of certain student groups to college. To do this would require an IPEDS Unit Record system – a national student record system – perhaps as envisioned in the Wyden-Rubio bill, The Student Right-to-Know Before You Go Act.
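To make the within-institution framing concrete, here is a small sketch (with entirely made-up numbers) of the kind of subgroup gap such a rating could look at: the graduation rate of Pell-eligible students versus that of students from families with incomes above $150,000 at a single institution.

```python
import pandas as pd

# Illustrative only: graduation outcomes by income subgroup at one institution.
cohort = pd.DataFrame({
    "subgroup": ["Pell-eligible", "income > $150k"],
    "entering_students": [1200, 800],
    "graduates_in_6_years": [660, 600],
})
cohort["grad_rate"] = cohort["graduates_in_6_years"] / cohort["entering_students"]

# The within-institution gap the text argues a rating should focus on.
gap = (cohort.loc[cohort.subgroup == "income > $150k", "grad_rate"].iloc[0]
       - cohort.loc[cohort.subgroup == "Pell-eligible", "grad_rate"].iloc[0])
print(cohort)
print(f"Gap between wealthiest and Pell-eligible students: {gap:.1%}")
```

A rating built this way rewards closing the gap rather than rewarding selectivity, which is exactly the distinction the CNU example below illustrates.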
This means overturning the current ban on a student record system. It also means taking a step that brings USED to a place where most of the states already are. From my perspective, it is hard to accept an overall rating system of my colleges from the federal government when I have far, far more data about those colleges and choose not to rate them. Instead, we focus on improvement through transparency and goal attainment.
I think few reasonable people will disagree with the idea of rating institutions on performance against the goals and participation agreements of Title IV. It is when the federal government chooses winners and losers beyond Title IV that disagreement sets in.
We will face disagreement over what standards to put in place, if we go down this path. That is part of the rough and tumble of policy, politics, and negotiated rulemaking. You know – the fun stuff.
Let’s take a quick look at four very different institutions. These images come from our publicly available institution profiles at http://research.schev.edu/iprofile.asp
Germanna Community College does not have high graduation rates (note that these are not IPEDS GRS rates, as they include students starting in the spring as well as part-time students). All of these rates are toward the lower end of the range for Virginia public two-year colleges. There is a range of differences among the subcohorts, particularly between students from the poorest and the wealthiest families.
Even at the institution with the highest graduation rates, among the highest in the nation, there is still a gap: a full 10 percentage points between the poorest and wealthiest students.
In the last two decades, CNU has more than doubled its graduation rates by transforming the institution and its admission plans. The differences between subcohorts are much smaller, but this has come at the price of denying access to students who sought an open-enrollment institution.
Ferrum College has relatively low graduation rates and high cohort default rates. Using federal data alone, it does not look to be an effective institution. However, I will point out that it has the highest success rate with students requiring developmental coursework in the first two years. It apparently serves some students well, and some better than other institutions do.
My point with these four examples is this: we need to drive improvements in student outcomes by focusing on differences within institutions, specifically among subcohorts of students who are recipients of Title IV aid.
President Obama’s higher ed advisor (James Kvaal) promised a “datapalooza.” I think he meant that as a good thing, but I fear it is not. What we need is thoughtful analysis, which I have now found on Tod’s GAO posting.
Thanks John. The way things typically go with these data events is that there is first a “data jam” to brainstorm and spec out possible products/applications. I’m involved in two such efforts now. The datapaloozas are normally held 60-90 days following that data jam so that actual products/pilots/commitments can be presented. Until USED presents a draft rating system, I don’t think the data jam can happen, which further delays a datapalooza.