Data Gathering
Output Records
At the time of the pilot, the University had neither a live publications database nor an institutional repository, so output records had to be collected manually.
Information was gathered from a variety of sources. Since 2003, all academic staff have been allocated a web framework known as Personal Professional Pages (PPPs). Amongst other functionality, PPPs allow staff to create an online CV, which can include publications (of all types) and career details. Usage of the PPPs varies widely across the institution: some are fully up to date, some were completed to meet the requirements of the RAE, and some are unused. Where data was available from PPPs, it was collected.
Additionally, several schools and research groups maintained local publication details. These were often in MS Word or Excel format and were gathered by the project team and used as raw material.
The University also searched the Web of Science (WoS) for entries. Approximately two thirds of the University's submission came from WoS.
Finally, some staff used subject-based sites such as SPIRES (hosted at Durham University) for high energy physics. Wherever possible, these sites were also searched.
Identification of correct records was complicated by several factors including:
- Collaborations with other local research groups such as the 'Plymouth Marine Association' where authorship is not always clearly identified
- The University is in partnership with Exeter University to provide the Peninsular Medical School. Here, papers are often jointly authored by members of both universities and credited to the Medical School rather than to either university. Additionally, staff move between the two universities, and these moves need to be tracked in order to make correct assignments. The hospitals associated with the Medical School also publish papers that attribute University staff to the hospital, and again these needed to be identified and correctly assigned
- Members of the medical school do not have PPPs
Because the outputs were gathered from a variety of sources, there were many duplicates to identify and correct. This was a major issue, and the project team found it a difficult task, made harder by the fact that there is no explicit link between author and affiliation in WoS entries.
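The duplicate problem described above can be illustrated with a minimal sketch. This is not the procedure the project team followed, which was manual; it only shows one plausible way to flag candidate duplicates by comparing a normalised title and publication year. The field names (title, year, source) and the sample records are assumptions made for illustration.

```python
import re
import unicodedata


def normalise_title(title):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def deduplicate(records):
    """Keep the first record for each (normalised title, year) pair.

    The sources of any later duplicates are appended to the kept record,
    so it remains visible where each output was found.
    """
    seen = {}
    for record in records:
        key = (normalise_title(record["title"]), record["year"])
        if key not in seen:
            seen[key] = dict(record, sources=[record["source"]])
        else:
            seen[key]["sources"].append(record["source"])
    return list(seen.values())


# Hypothetical sample data: the same paper found in WoS and on a PPP,
# plus a record with the same title but a different year.
records = [
    {"title": "Tidal Mixing in Estuaries", "year": 2005, "source": "WoS"},
    {"title": "Tidal mixing in estuaries.", "year": 2005, "source": "PPP"},
    {"title": "Tidal Mixing in Estuaries", "year": 2006, "source": "Excel"},
]
unique = deduplicate(records)
```

A key of this kind only catches near-identical titles. Records that differ in year, or whose titles were abbreviated in one source, would still need manual review, which is consistent with the difficulty the project team reported.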
Staff
The University changed its HR system in 2006, during the survey period for the REF pilot, and, as is common, only a limited amount of data was transferred from the legacy system to the new one. As a result, the project team experienced some challenges in compiling a full set of qualifying staff data and had to refer back to the legacy system for some pieces of data. Specifically, members of staff who left during the survey period but before the new system went live were not recorded in the new system.
Other returns, principally RAE and HESA, were also used as information sources, and where additional information was identified it was merged manually with the information from the HR system(s).
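As an illustration only, the merging of legacy and current HR data might be sketched as follows. The project team did this manually; the sketch assumes that both systems share a staff identifier, which the pilot does not confirm, and all field names and sample records are hypothetical.

```python
def merge_staff(legacy, current):
    """Combine staff records from two systems, keyed by staff identifier.

    Values from the current system take precedence, except where the
    current system holds None, in which case the legacy value is kept.
    An 'origin' field records which system(s) supplied the record.
    """
    merged = {sid: dict(rec, origin="legacy") for sid, rec in legacy.items()}
    for sid, rec in current.items():
        combined = dict(merged.get(sid, {}))
        combined.update({k: v for k, v in rec.items() if v is not None})
        combined["origin"] = "both" if sid in legacy else "current"
        merged[sid] = combined
    return merged


# Hypothetical sample data: S1 left before the new system went live,
# S2 appears in both systems, S3 joined after the changeover.
legacy = {
    "S1": {"name": "A. Leaver", "left": "2005-08-31"},
    "S2": {"name": "B. Stayer", "start": "2001-09-01"},
}
current = {
    "S2": {"name": "B. Stayer", "start": None, "grade": "Lecturer"},
    "S3": {"name": "C. Newcomer", "start": "2007-01-01"},
}
staff = merge_staff(legacy, current)
```

Keeping the origin of each record makes it easy to list the staff who appear only in the legacy system, which is the group the text identifies as missing from the new one.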
'Early Career Researcher' status was not held in either HR system and had to be gathered manually from a variety of sources. The University believes that the data collected was, in the end, at least eighty percent accurate.
Neither the 'Previous Institution' nor the 'Destination Institution' was held systematically. Some staff PPPs provided the data, and some UK destinations were held as HESA codes. In both cases the data was incomplete, so most of it had to be gathered manually. The University is putting mechanisms in place to capture the information systematically in future.
The University returned very few staff in categories C and D, as these were generally visiting professors and similar staff whose primary allegiance was to other institutions.
Link Records
The linking table was completed manually. The Staff table was processed row by row, and each staff name was searched for in the Output table. When a match was found, a link record was created. Extra columns were included in the link table for quality assurance purposes, and these were passed on to Evidence. A second pass was then made through the Staff and Output tables to confirm the linkages.
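The row-by-row matching described above can be sketched in code, although the pilot itself did this by hand. The sketch below assumes that names are compared on surname plus initials; the table layouts, the identifiers and the matched_on column (standing in for the unspecified quality assurance columns) are all hypothetical.

```python
def name_key(surname, initials):
    """Reduce a name to a comparable (surname, initials) pair."""
    cleaned = initials.replace(".", "").replace(" ", "").upper()
    return (surname.strip().lower(), cleaned)


def build_links(staff, outputs):
    """Create one link record per (staff member, output) name match."""
    links = []
    for person in staff:
        key = name_key(person["surname"], person["initials"])
        for output in outputs:
            author_keys = {name_key(s, i) for s, i in output["authors"]}
            if key in author_keys:
                links.append({
                    "staff_id": person["staff_id"],
                    "output_id": output["output_id"],
                    # Hypothetical QA column recording the basis of the match.
                    "matched_on": "surname+initials",
                })
    return links


# Hypothetical sample data.
staff = [
    {"staff_id": "S1", "surname": "Smith", "initials": "J.A."},
    {"staff_id": "S2", "surname": "Jones", "initials": "R"},
]
outputs = [
    {"output_id": "O1", "authors": [("Smith", "JA"), ("Brown", "T")]},
    {"output_id": "O2", "authors": [("jones", "R."), ("Smith", "J A")]},
]
links = build_links(staff, outputs)
pairs = [(link["staff_id"], link["output_id"]) for link in links]
```

Matching on name alone cannot separate two people with the same surname and initials, so a confirmation pass of the kind described in the text would still be needed before the links could be relied upon.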


