Posts Tagged ‘data’

Horses and water…

June 30, 2015

Every year a certain amount of Parliamentary time is dedicated to Private Members’ Bills. These are opportunities for individual MPs to present legislation on practically any topic. This year’s crop illustrates the wide variety – we have bills on riot compensation, exemption from hospital parking charges for carers, and Highways (Improvement, Traffic Regulation and Traffic Management). Amongst the list is the Higher Education (Information) Bill being presented by the Conservative MP for South Cambridgeshire, Heidi Allen.

The summary of the Bill indicates that it is “to require information to be made available to prospective undergraduate students about what is provided to students for the tuition fees charged, how tuition fee resources are expended and what is expected of students; to establish transparency in how tuition fees are spent; and for connected purposes”.

There are a number of potential issues here, and I'll say now that I am not privy to what will be included in the Bill, so all of this is guesswork. We'll hopefully know more when the Bill gets its second reading on 23 October. Firstly, there is the level of detail required – are universities going to be expected to provide such detail on a course by course basis? There would seem little value in the Bill if they weren't, so the assumption is that universities will be expected to show different costs for different courses. Costs such as Estates, IT, the Library and indeed academic time will therefore need to be apportioned across different courses, and the cost to universities of calculating and providing this information will increase with the level of accuracy required.
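
To make the apportionment point concrete, here is a minimal sketch of one way shared costs might be divided across courses, weighted by student numbers. The figures, course names and the FTE-based weighting are entirely hypothetical – as far as we know, the Bill prescribes no method.

```python
# Hypothetical illustration: apportioning shared costs across courses by student FTE.
# All figures and course names are invented; the Bill does not prescribe any method.

shared_costs = {"Estates": 500_000, "IT": 200_000, "Library": 150_000}

course_fte = {"History BA": 120, "Physics BSc": 90, "Law LLB": 150}

total_fte = sum(course_fte.values())

def cost_per_course(course: str) -> float:
    """Share of the pooled costs allocated to a course, weighted by its student FTE."""
    weight = course_fte[course] / total_fte
    return sum(pool * weight for pool in shared_costs.values())

for course in course_fte:
    print(f"{course}: £{cost_per_course(course):,.0f} of shared costs")
```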

If, on the other hand, all that is required is aggregate costs, then there seems to be little value to the applicant. There will still need to be some apportioning of costs (how much academic time is spent on teaching compared with research, for example) and as such institutions will need to justify how they have split those costs if they are to avoid accusations of top-loading the tuition fee spend.

Regardless of the level of detail, there will be a cost to universities of putting this information together. However, I would question how much use applicants make of the information that is already available to them. I know of several 18-year-olds who are (hopefully) heading to university in September and none has made use of the Key Information Set in making their choices. I realise there is an element of horses and water here, but is there any evidence that providing this information will really lead to a significant number of applicants being influenced by how their money is spent?

I am not against transparency and I do believe that there is value in demonstrating how income from course fees is spent. However, I am not convinced that there is a strong business case for providing such information, nor do I believe that it will radically change the way that applicants make their choices. Regardless of the level of detail provided, there will be a cost to provide it, as there is a cost of meeting the requirements of other legislation. How many institutions will be open enough to include a line detailing the cost of providing the information required?


Shaping the information landscape

February 5, 2015

One of UCISA’s roles is to ensure that suppliers to our sector are kept abreast of developments that may impact the software and services they deliver. The aim is to alert suppliers of potential changes in legislation or other statutory requirements so that they can effectively plan future developments. A recent example of this activity was the briefing day that UCISA and HEDIIP arranged at the end of January to bring suppliers of student records systems up to date with the work being carried out under the HEDIIP programme.

The meeting heard updates on four of the HEDIIP projects: data capability, the new subject coding system, the Unique Learner Number and the new Information Landscape. In addition we heard from HESA about the CACHED project. The aim of the HEDIIP programme is to redesign the information landscape to enhance the arrangements for the collection, sharing and dissemination of data and information about the HE system. Each of these projects will contribute to that overall goal – I won't go into detail on them here but if you are interested in learning more, each is outlined on the HEDIIP website.

There were a number of common themes that emerged from the day. The first was the adoption of standards. One of the challenges the sector faces currently is that the same term can mean different things to different organisations (the term course being a prime example) so standard data definitions are essential to a common understanding and data sharing. This has been a particular problem with the JACS subject coding scheme where changes and growth in JACS’ range of functions mean it is no longer consistently applied.
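
By way of illustration only, a standard data definition might look something like the sketch below – the field names and the example values are invented and are not drawn from any HEDIIP or HESA specification.

```python
# Hypothetical sketch of a shared definition of "course" expressed as a data structure.
# Field names and values are illustrative only, not taken from any sector standard.
from dataclasses import dataclass

@dataclass
class Course:
    provider_id: str   # the institution delivering the course
    title: str         # published course title
    subject_code: str  # code from an agreed subject coding scheme (placeholder below)
    mode: str          # e.g. "full-time" or "part-time"
    credits: int       # credit value under an agreed credit framework

example = Course("INST001", "History BA", "X000", "full-time", 360)
print(example)
```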

The second theme was managing cultural change, both within higher education institutions and within a number of the organisations requesting data from the sector. In some institutions, many processes are geared around producing the HESA return and the need to get it "right". The focus on a single return suggests that these institutions may be unaware of the volume of demands made on their data and the amount of resource across the institution spent in ensuring the various returns made are correct. It is highly unlikely that there will be one version of the truth in these institutions – indeed it was noted that one institution had over 200 separate collections of student records. It goes without saying that data management in such institutions is poor – it will take a significant change to move from data being an input to deliver a return to a point where it is seen as an institutional asset.

Finally, the biggest challenge is governance. At an institutional level, mature data management will only be achieved with effective information governance driven from the top table. Getting the value of data understood at senior management level is key to improving data and information management within an institution. There are wider governance issues that the HEDIIP programme will need to address. Moving to a set of standard data definitions is one challenge – putting governance mechanisms in place to ensure that the standards remain consistently applied and understood is a challenge of a different order. Similarly with the new subject coding scheme: establishing a governance model that is supported by an appropriate selection of stakeholders, with sufficient authority and resources to manage its evolution, will be critical to the success of the new scheme.

The feedback from those suppliers present was positive. They could recognise the efficiencies in moving to a model where, for the most part, data is submitted to a single point at various points in the year and drawn down from a single repository. The HEDIIP programme is only part of achieving this goal – the institutions need to improve their data management and change their processes, those requesting data may also have to change their processes and suppliers will need to amend their systems to implement new standards and enable data to be extracted at key points in the academic year or cycle. It will be a long journey but one that offers much reward.

Supporting students with analytics

November 14, 2012

One of the topics that cropped up several times at this year's Educause conference was analytics. This has been an emerging topic for some time and something that I don't think we have really developed here in the UK, although next week's CISG Conference may prove otherwise. Three of the examples presented stood out, each using institutional data to support students in a different way.

In the first case, an institution had identified a specific problem and made use of the data it held to offer a solution to its student population. The problem was that many students select modules or pathways that are not suitable for them. This affects the completion rate, as students on courses to which they are not suited often drop out or fail. As the level of completion was one of the criteria used to establish funding for the institution concerned, finding a solution would have a financial benefit.

One reason for students selecting the wrong options is that the language used to describe modules is often opaque – the suggestion was that the descriptions might just as well be in a foreign language for the amount of sense they made to the undergraduate trying to select their degree pathway. The solution was a course selection programme that made use of historic and personal data to recommend the modules that would best suit the students and so offer them the best chance to succeed. The grade information from all students over time was combined with the entry grades for the individual student to predict the grades for that student's chosen path. The system then uses the course plans and predicted grades to make recommendations that assist students in choosing. However, the student is not left to his or her own devices to make the decision; rather, the system supplements face to face advice and helps inform the discussion with advisors.

There was some concern that such a system would lead to greater standardisation in course pathways but, in the same way that Amazon or iTunes Genius offer unexpected choices, it has sometimes broadened horizons. The predictions made as part of the recommendations have proved to be very accurate, and that accuracy will improve further as more data is collated and used. And the system is delivering results – students taking the modules recommended to them are getting better grades, which in turn gives those students greater incentive to return. The student is also given advice on the potential careers their major may lead to, which may also influence their choice.
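
A very rough sketch of the kind of logic described might look like the following: predict a student's likely grade on each module from the historic performance of students with similar entry grades, then recommend the strongest matches. The data, the similarity rule and all the numbers are invented for illustration and are not the institution's actual model.

```python
# Hypothetical sketch of grade-based module recommendation.
# Historic records, the similarity rule and all numbers are invented for illustration.
from statistics import mean

# (entry_points, module, grade_achieved) for past students
historic = [
    (340, "Statistics", 68), (340, "Microeconomics", 55),
    (300, "Statistics", 52), (300, "Microeconomics", 61),
    (360, "Statistics", 74), (360, "Microeconomics", 58),
]

def predict_grade(entry_points, module, band=40):
    """Average past grades on a module among students with similar entry points."""
    similar = [g for (e, m, g) in historic
               if m == module and abs(e - entry_points) <= band]
    return mean(similar) if similar else None

def recommend(entry_points, modules, top_n=1):
    """Rank modules by predicted grade and return the strongest matches."""
    scored = [(predict_grade(entry_points, m), m) for m in modules]
    scored = [(p, m) for p, m in scored if p is not None]
    return [m for p, m in sorted(scored, reverse=True)[:top_n]]

print(recommend(350, ["Statistics", "Microeconomics"]))  # e.g. ['Statistics']
```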

The system builds on the recognition that students have particular strengths and weaknesses and as such are going to be better suited to some paths than others. This was a problem identified in the Italian higher education system a few years ago; one of the main reasons for the high drop-out rate in Italian universities was simply that students were taking courses that were wholly inappropriate for their personal traits. The introduction of pre-admissions profiling resulted in students taking courses more suited to their personalities and analytic capabilities, with a consequent improvement in retention rates.

Another institution had created a student dashboard that facilitated evaluation of both a cohort's and individual students' learning. Again the system made use of both historic and current data to help guide students towards the resources they needed according to their learning preferences and styles. The dashboard was used by students to identify areas where they needed further work in order to improve their grades, and by faculty to identify areas where the lessons delivered in their teaching had not been absorbed by their students. Consequently faculty were able to make changes to their courses to improve understanding and hence student success. The intention was to move to a personalised environment where a body of historic and personal data is used to understand an individual's learning style and so better guide them to what they need to succeed.

Finally, one institution made use of a wide range of current data to assess which students were at risk of dropping out. This is a common application in the UK, where data is pulled from diverse sources such as the coursework submission system, library access gates and VLE login data to pick up those who are not engaging with their university. There was a difference in the application, however. The US institution analysed the data and put the students into three categories. Interestingly, rather than focus on the lower tier (i.e. those most at risk of failing), efforts were concentrated on the middle tier to ensure they remained and improved their results. This contrasts with the approach adopted by UK institutions, where support is given to the most at-risk group to try to ensure they continue in their studies. This perhaps reflects the different drivers – in the US, some institutions are funded according to the number of successful graduates, so it makes sense to put more effort into those who are likely to stay but need support to graduate. In the UK, institutions are penalised for high drop-out rates and so the emphasis is on retaining those students at risk of dropping out.
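
A simplified sketch of that tiering approach is below: engagement signals from several systems are combined into a single score and students are placed into three bands. The weightings and thresholds are invented for illustration; any real model would be calibrated against local data.

```python
# Hypothetical sketch of placing students into three risk tiers from engagement data.
# The weightings and thresholds are invented; a real model would be calibrated locally.

def engagement_score(coursework_submissions: int, library_visits: int, vle_logins: int) -> float:
    """Combine simple activity counts into a single engagement score."""
    return 2.0 * coursework_submissions + 0.5 * library_visits + 0.2 * vle_logins

def risk_tier(score: float) -> str:
    """Assign a student to one of three tiers based on their engagement score."""
    if score < 5:
        return "lower"   # most at risk of dropping out
    if score < 15:
        return "middle"  # the group the US institution targeted for support
    return "upper"

score = engagement_score(coursework_submissions=2, library_visits=6, vle_logins=20)
print(risk_tier(score))  # 'middle'
```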

Students paying higher fees are likely to expect far more in terms of support. The data institutions have built up over the years can play a role in supporting students but there is a cost in developing systems to analyse that data and tailor it to individual circumstances. The driver for institutions to invest may well emerge as the quality of student support becomes an area to seek competitive advantage.

Improving the business (part 1)

November 27, 2009

A couple of the presentations at the CISG conference last week focussed on what could be done to deliver improvements within the institution. The first of these, from the University of Exeter, concentrated on a project to improve the quality of data within the institution. There are consequences of poor data, particularly in the higher education sector where league tables are important factors for both staff and student recruitment. The interest in league tables is highlighted by the million hits on the Times Higher's website after it published its world league table of institutions. There is also an operational cost of poor data. Firstly, institutions spend considerable time correcting data, particularly getting it right for statutory returns. Secondly, there is a risk that poor data may result in incorrect decisions being made, or that additional systems are developed to deliver management information because the core systems' data is not trusted.

The Exeter project had two strands – correction and prevention. When correcting the current data, they looked to establish why there was a problem and identify the issues that were causing erroneous data to be created. The staff responsible for the data were involved throughout the process as errors were identified and corrected. However, errors will still creep in if staff do not understand why the data they are entering is required and what the consequences are if data is incorrect. Consequently there was an extensive training programme to explain what the data is used for and to highlight the implications of incorrect data. Training on the use of systems is now augmented by training on the use of the data within them.

The project recognised some key problem areas. One of these was around the data required for the HESA return and postcode data in particular. As a result of the project Exeter now employ a temp for 3 – 4 weeks following registration to clean student data. This has resulted in a big drop in postcode errors and although it has a cost of around £2k it has resulted in a £30k saving in staff time later in the year.
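As an illustration of the sort of check involved, the sketch below flags malformed UK postcodes for correction. The regular expression is a simplified approximation of the postcode format, not the full specification, and this is not Exeter's actual process.

```python
# Hypothetical sketch: flag student records with malformed UK postcodes for correction.
# The regular expression is a simplified approximation of the UK postcode format.
import re

POSTCODE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}$", re.IGNORECASE)

records = [
    {"student_id": "S001", "postcode": "EX4 4QJ"},
    {"student_id": "S002", "postcode": "EX44QJ"},     # missing space, still matches
    {"student_id": "S003", "postcode": "NOT KNOWN"},  # flagged for correction
]

to_fix = [r["student_id"] for r in records if not POSTCODE.match(r["postcode"].strip())]
print(to_fix)  # ['S003']
```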

Improved data quality is something the Data Efficiency Group formed by HEFCE is striving to achieve. The Group has highlighted that few institutions have someone on their Executive Board who has responsibility for data; the then Chief Executive of HESA noted that Vice-Chancellors were only aware of problems at their institutions when HESA alerted them. Exeter clearly have senior ownership of information – for this project to succeed it needed senior management buy-in from the top, but also investment in training at the operational level. It strikes me as a model others could follow.

Importance of data quality

October 2, 2009

I attended a meeting of the Data Efficiency Steering Group yesterday. The Group was set up to look at recommendations in a report by KPMG on the use of collected data in the higher education sector. The meeting is an interesting mix – on the one hand are the funding councils as the main stakeholders in the data collection and on the other are the various professional associations representing the views of those tasked with collecting, processing and using the data at an institutional level.

There are conflicting demands on the funding councils – they are looking to reduce the administrative burden of collecting data but are under pressure to demonstrate that the sector is meeting its statutory obligations. They also need to work together to achieve a consensus so that comparisons can be made across the whole of the UK and local needs met, without ending up with a hotchpotch of collections or one unwieldy one. These conflicts were brought into focus when talking about monitoring external examiners to ensure equal opportunities. One funding council representative expressed the view that external examiners should not be included; responsibility rested solely with the institution to ensure that examiners were appropriate in terms of their academic credibility. Another view was that they should be regarded as employees and included in monitoring. Against this, the institutions highlighted that data on external examiners is not held in a consistent way across the sector – in a number of cases there was no record of such positions on HR databases, and monitoring would impose an additional administrative burden.

In addition to ensuring that the data collected is appropriate and fit for purpose, there is a second stream of activities aimed at promoting the value of the HESA data itself. Often this is seen as a burden in institutions and there is a lack of senior management ownership of the data. As a consequence, the importance and impact of the data are not always clearly understood within institutions (the consequences of poor data quality only being highlighted when performance indicators or league tables are published). Some institutions are, however, focusing more on the data they collect and have to submit. As a result they are starting to use the data more effectively within their institutions to highlight problems, and are improving the quality of their own data to reduce the likelihood of mistakes in submissions. Case studies on these institutions may help others improve their game. Guidance on how the data comes together and is translated into performance indicators will also bring some focus to the importance of data quality.

So two strands of activity – making sure the data collected is appropriate, timely and does not increase the burden on institutions and promoting the uses of the data and the need for data quality. Progress is being made but it will take time to deliver on both fronts. There is some overlap with the work of the MIAP programme. It is a measure of how far that particular programme has to go to convince the sector of its value that one of the attendees commented ‘not that that [MIAP] will deliver anything until I am long in retirement’. The aims of MIAP are to improve data efficiency. But thus far, the sector has not been convinced that it will deliver savings or reduce the administrative burden. The proposed pilot studies will be key in demonstrating the business benefits and promoting the programme to the sector.

Research data services – a way forward?

March 2, 2009

Last week I attended a conference to discuss research data services. The focus was on the UKRDS project, one of a number of feasibility studies funded by HEFCE. The existing research data is something of an untapped resource; it is hardly reused by those that create it, let alone by others. And the volume of data (and hence the size of the untapped resource) will grow substantially over the next few years.

The UKRDS project was looking to address a number of issues, the biggest of which are probably cultural rather than technical. We heard that researchers are simply not used to putting the data they generate into the public domain – it is often poorly documented and lacks the quality metadata needed to allow it to be found and used. The Australian National Data Service also identified that the cost of contributing data to its data service outweighed the benefits gained from publishing the data. One school of thought was that including citations of data in research metrics would encourage more researchers to deposit data. Another was that more 'stick' was required, making the publication of data part of funding conditions and/or Government policy. What was clear was that researchers will need to be trained in how to deliver well documented data that will be of wider use, and that the mechanism for producing the data needs to be low cost. Researchers will also need training in data mining techniques to ensure that they are able to take advantage of the research data held.
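
As an illustration of what 'well documented' might mean in practice, the sketch below checks a deposit against a minimal set of metadata fields. The required fields are loosely modelled on common discovery metadata and are invented for the example – they are not drawn from the UKRDS work.

```python
# Hypothetical sketch of a minimal dataset metadata record with a completeness check.
# The required fields are illustrative and are not taken from any UKRDS specification.

REQUIRED_FIELDS = ["title", "creator", "description", "date", "licence", "format"]

def missing_metadata(record: dict) -> list:
    """Return the required fields that are absent or empty in a deposit record."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

deposit = {
    "title": "Sea surface temperature observations, 2005-2008",
    "creator": "A. Researcher",
    "description": "",
    "date": "2009-01-15",
}

print(missing_metadata(deposit))  # ['description', 'licence', 'format']
```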

There will be technical issues. Tools need to be developed to allow easy access and ideally there needs to be a standardised approach as far as the different disciplines will allow. Resourcing is also an issue, not just in identifying capital funding to deliver a strong pilot but also to deliver a sustainable, scalable solution. The proposed Pathfinder project where a number of institutions will look to build a pilot RDS should clarify some issues and identify a way forward. Regardless of the future direction, a research data service will need a robust infrastructure and ongoing resourcing.