The state of UK public sector analysis code: 2023

How to use this research

Responding to the Coding in Analysis and Research Survey (CARS) is voluntary, so the results presented here come from a self-selecting sample of government analysts. They reflect only the views of the analysts who participated and cannot be used to infer general conclusions about the wider population of government analysts.

For more detail, see the data collection page.

Coding frequency and tools

We asked all respondents “In your current role, how often do you write code to complete your work objectives?”

2023 data

In your current role, how often do you write code to complete your work objectives? Percent
Never 13.3%
Rarely 11.7%
Sometimes 19.7%
Regularly 27.4%
All the time 27.9%
Sample size = 1297

Over half of respondents (55%) reported coding regularly or all of the time in their current role.

The relationship between coding frequencies and grade, professional membership and management responsibilities was explored using an ordinal regression model. Only data from respondents in the Civil Service (CS) were included, and CS grades were collapsed into AO/EO, HEO/SEO (including Fast Stream) and G7/G6.
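The report does not state which form of ordinal regression was fitted. As a minimal sketch, assuming a proportional-odds (cumulative logit) model, the snippet below shows how such a model turns a set of thresholds and a respondent's linear predictor into probabilities for the five response categories. The threshold and effect values are hypothetical and are not estimates from the survey.

```python
import math

# Ordered response categories from the survey question.
CATEGORIES = ["Never", "Rarely", "Sometimes", "Regularly", "All the time"]


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(thresholds, linear_predictor):
    """Return P(Y = category) for each category, lowest to highest.

    Assumes a proportional-odds model:
        P(Y <= j) = logistic(threshold_j - linear_predictor)
    `thresholds` must be strictly increasing, one fewer than the categories.
    """
    cumulative = [logistic(t - linear_predictor) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical thresholds and effect size, chosen only for illustration.
THRESHOLDS = [-1.9, -1.1, -0.2, 1.0]
baseline = category_probabilities(THRESHOLDS, 0.0)
shifted = category_probabilities(THRESHOLDS, 0.5)  # e.g. a positive grade effect

for name, p0, p1 in zip(CATEGORIES, baseline, shifted):
    print(f"{name:<13} baseline={p0:.3f} shifted={p1:.3f}")
```

Under this assumed model, a positive effect moves probability towards ‘All the time’ and away from ‘Never’, while the ordering of the categories is preserved.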

The results showed that analysts at HEO/SEO grade were more likely to code regularly than those in the grades above (G7/G6) or below (AO/EO), when controlling for profession and management responsibilities.

63% of HEO/SEO analysts reported coding ‘regularly’ or ‘all of the time’ in their roles, compared to 54% of AO/EOs and 41% of G7/G6s. Respondents who managed coders coded more frequently than those without management responsibilities, while those managing non-coders were the least likely of all groups to code frequently.

Of the Civil Service professions, data scientists were more likely to code regularly than those not in the profession. Those in the economist (GES) and social research (GSR) professions were less likely to use code in their roles than other analysts. This is likely to be related to differences in the analytical tools associated with each profession, which are covered in the sections on tool capability below.

Coding frequency over time

In your current role, how often do you write code to complete your work objectives? 2020 2021 2022 2023
Never 15.1% 12% 12.3% 13.3%
Rarely 12.9% 13% 10.6% 11.7%
Sometimes 20.4% 18.4% 18.2% 19.7%
Regularly 29.7% 30.9% 29% 27.4%
All the time 21.9% 25.7% 29.9% 27.9%

Access to and knowledge of programming languages

Given a list of programming tools, we asked all respondents if the tool was available to use for their work.

Access to tools does not necessarily refer to official departmental policy. Some analysts may have access to tools that others cannot access within the same organisation.

The most readily available tools were the open source tools: R (92%), Python (74%) and SQL (69%) were each available to the majority of respondents. By contrast, fewer than half of respondents reported having access to each of the listed proprietary tools, and between 40% and 52% did not know whether each tool was available.

Access to coding tools

Programming tool Yes No Don't know
Python 74.2% 12.6% 13.2%
R 92.1% 4.9% 3%
SQL 68.5% 9% 22.4%
Matlab 5.1% 42.9% 52%
SAS 28.7% 30.3% 41%
SPSS 29% 31.2% 39.8%
Stata 17% 33.4% 49.6%
VBA 40.6% 17.2% 42.2%
Sample size = 1297

Given the same list of programming tools, all respondents were asked if they knew how to program with the tool to a level suitable for their work, answering “Yes”, “No” or “Not required for my work”.

The tools with the highest reported capability were R and SQL, with 64% and 50% of respondents respectively reporting that they were able to use them to do their work. Despite Python being widely available, only 37% reported being able to use it. Capability in each of the proprietary tools did not exceed 20%, suggesting that they are only used in particular roles.

Note that capability in programming languages is self-reported here and was not objectively defined or tested. The statement “not required for my work” was similarly not defined.

Knowledge of coding tools

Programming tool Yes No Not required for my work
Python 36.6% 33.9% 29.5%
R 63.5% 22.3% 14.2%
SQL 50.2% 24.7% 25.1%
Matlab 7.2% 27.1% 65.8%
SAS 16.6% 28.8% 54.7%
SPSS 18.8% 25% 56.2%
Stata 8.4% 28.1% 63.5%
VBA 13.9% 31.8% 54.4%
Sample size = 1297
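To make the difference between availability and capability easier to compare across tools, the sketch below subtracts the ‘Yes’ percentage in the knowledge table from the ‘Yes’ percentage in the access table. The figures are copied from the two tables; the function and variable names are illustrative only. The two questions were answered at the aggregate level, so the gap does not show whether the same individuals have access to a tool and know how to use it.

```python
# 'Yes' percentages from the access and knowledge tables (2023).
ACCESS = {
    "Python": 74.2, "R": 92.1, "SQL": 68.5, "Matlab": 5.1,
    "SAS": 28.7, "SPSS": 29.0, "Stata": 17.0, "VBA": 40.6,
}
CAPABILITY = {
    "Python": 36.6, "R": 63.5, "SQL": 50.2, "Matlab": 7.2,
    "SAS": 16.6, "SPSS": 18.8, "Stata": 8.4, "VBA": 13.9,
}


def access_capability_gaps(access, capability):
    """Return (tool, gap) pairs sorted from largest to smallest gap.

    The gap is access minus capability, in percentage points.
    """
    gaps = {tool: round(access[tool] - capability[tool], 1) for tool in access}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


gaps = access_capability_gaps(ACCESS, CAPABILITY)
for tool, gap in gaps:
    print(f"{tool}: {gap:+.1f} percentage points")
```

On these figures Python has the largest gap between access and capability, while Matlab is the only tool where reported capability is higher than reported access.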

Open source capability over time

The proportion of respondents who reported being able to use R or Python is shown alongside the proportion able to use SAS, SPSS or Stata for the past four years of the survey.

Both groups showed a drop in 2023. Capability in proprietary tools had the larger decline, falling by 20 percentage points from 2022 (from 56% to 36%, a relative decrease of about 36%), while open source capability fell by 8 percentage points. This is not necessarily representative of a general shift away from proprietary tools: CARS is a self-selecting survey, so year-on-year comparisons must be made with caution.

Programming language type Year Know how to programme with these tools (percent) Lower confidence limit (percent) Upper confidence limit (percent)
Open Source 2020 69.2% 66.3% 71.9%
Open Source 2021 77% 74.1% 79.6%
Open Source 2022 80.3% 78% 82.3%
Open Source 2023 72.6% 70.1% 74.9%
Proprietary 2020 60.5% 57.5% 63.4%
Proprietary 2021 60.1% 56.9% 63.2%
Proprietary 2022 56.3% 53.6% 58.9%
Proprietary 2023 36.1% 33.5% 38.7%
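The sketch below shows two calculations that help when reading this table: the change between two years expressed both in percentage points and as a relative change, and a confidence interval for a single proportion. The report does not say how its confidence limits were calculated; the Wilson score interval used here is an assumption, as is applying the 2023 sample size of 1297 to the open source figure.

```python
import math


def change_between(earlier, later):
    """Return (percentage-point change, relative change in percent)."""
    points = later - earlier
    return points, 100 * points / earlier


def wilson_interval(proportion, n, z=1.96):
    """Return the (lower, upper) Wilson score interval for a proportion.

    `z` = 1.96 corresponds to an approximate 95% interval.
    """
    denominator = 1 + z**2 / n
    centre = (proportion + z**2 / (2 * n)) / denominator
    half_width = (
        z * math.sqrt(proportion * (1 - proportion) / n + z**2 / (4 * n**2))
        / denominator
    )
    return centre - half_width, centre + half_width


# Proprietary capability: 56.3% in 2022 and 36.1% in 2023.
points, relative = change_between(56.3, 36.1)
print(f"Change: {points:.1f} percentage points ({relative:.1f}% relative)")

# Open source capability in 2023: 72.6%, assuming n = 1297.
lower, upper = wilson_interval(0.726, 1297)
print(f"Interval: {100 * lower:.1f}% to {100 * upper:.1f}%")
```

With these inputs the interval runs from roughly 70% to 75%, close to the published limits. Small differences are expected because the published figures are based on unrounded data.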

Capability in different tools by profession

Differences in preferred languages may lead to silos between analytical professions. Here we show the percentage of respondents reporting capability in different tools within the different analytical professions.

Note that respondents might be members of more than one profession, and may report capability in more than one tool.

Profession Python R SQL Matlab SAS SPSS Stata VBA
Data engineers 90% 46.7% 86.7% 13.3% 20% 13.3% 3.3% 10%
Data scientists 77.8% 84.7% 79.2% 9.7% 16.7% 11.1% 4.2% 14.6%
Digital and data (DDAT) 63.7% 58.8% 75% 3.8% 15% 5% 0% 22.5%
Economists (GES) 19.4% 58.9% 29.8% 4.8% 9.7% 6.5% 29% 11.3%
Geographers 55% 80% 60% 10% 20% 30% 10% 15%
Operational researchers (GORS) 53.9% 86.5% 62.4% 15.6% 24.1% 9.2% 1.4% 36.2%
Social researchers (GSR) 23.7% 41.4% 19% 0.9% 12.5% 40.5% 9.5% 3.4%
Statisticians (GSG) 35% 75.4% 56.7% 7.8% 21.7% 23.6% 8% 9%

A binomial Generalised Linear Mixed Model (GLMM) was used to explore differences in the use of open source tools (R and Python) between the different professions, taking into account that individuals may be members of multiple professions.

Those in the data science, operational research (GORS), and statistics (GSG) professions were more likely to be using open source coding tools than those not in these professions. Those in the social research profession (GSR) were less likely to use open source coding tools than other analysts. The remaining professions were found to be not significantly associated with using open source tools, potentially because of increased uncertainty due to the smaller sample sizes from these groups.

Access to and knowledge of git

We asked respondents to answer “Yes”, “No” or “Don’t know” for the following questions:

  • Is git available to use in your work?
  • Do you know how to use git to version-control your work?

These outputs include people who do not code at work.

Access to git

Response Percent
Yes 74.3%
No 7%
I don't know 18.7%
Sample size = 1297

Knowledge of git

Response Percent
Yes 59%
No 37.5%
I don't know 3.5%
Sample size = 1297

Git was available to 74% of respondents, with only 7% reporting having no access to Git. Capability in Git was lower, however: 37% of respondents stated that they did not know how to use Git for version control. This decreased to 32% when excluding non-coders.

Coding capability and change

Where respondents first learned to code

Respondents who had coding experience outside their current role were asked where they first learned to code. Analysts who code in their current role but reported no other coding experience are included as having learned ‘In current role’. Those who reported first learning to code outside of a work or educational environment were categorised as ‘self-taught’ based on free-text responses.

These results only show where people first learned to code. They do not show all the settings in which they had learned to code, to what extent, or how long ago.

Where learned Percent
Current employment 23.9%
Education 52.2%
Previous private sector employment 4.4%
Previous public sector employment 10.2%
Self-taught 7.3%
Other 2%
Sample size = 1125

The majority of respondents (52%) first learned to code in education, with the next largest group (24%) learning in their current public sector role.

We asked respondents with higher education qualifications to give details of the level, subject and if there was a coding component. Degrees in computer science, mathematics and physical sciences were the most likely to have a coding component, and doctoral degrees were more likely to have a coding component compared to Master’s and Bachelor’s when accounting for individuals studying at multiple levels.

Change in coding ability during current role

We asked “Has your coding ability changed during your current role?”

This question was only asked of respondents with coding experience outside of their current role. This means analysts who first learned to code in their current role are not included in the data.

Ability change Percent
Significantly worse 3.6%
Slightly worse 8.3%
Stayed the same 13.7%
Slightly better 35%
Significantly better 39.4%
Sample size = 856

The effect of coding frequency, grade, profession and management responsibilities on changes in coding ability was explored using an ordinal regression model. The model showed a strong positive relationship between coding frequency and increased capability, with those who code ‘all the time’ being the most likely to show improvements.

Seniority was also a predictor of capability change, although its effect was smaller. Those at G7/G6 were less likely to have improved their coding skills in their role compared to lower grades. 21% of G7/G6 analysts felt their coding skills had declined while in role, which was higher than both HEO/SEOs (9%) and AO/EOs (4%). Management responsibilities and membership of a profession were not significantly associated with coding ability change.

This demonstrates the importance of regular practice in developing and maintaining coding capability, and suggests that analysts at higher grades face barriers to maintaining their technical skills.

Reproducible analytical pipelines (RAP)

RAP refers to the use of good software engineering practices to make analysis pipelines more reproducible. This approach aims to use automation to improve the quality and efficiency of analytical processes.

The following links contain more resources on RAP:

  • You can find minimum RAP standards in the RAP MVP
  • You can find guidance on quality assuring code in the Duck Book

Awareness of RAP over time

We asked respondents who code at work whether they had heard of RAP.

Awareness of RAP appears to be increasing over time. In 2023, 88% of respondents had heard of RAP, the highest proportion to date.

Year Heard of RAP (percent) Lower confidence limit Upper confidence limit
2020 68.4% 65.3% 71.4%
2021 75.8% 72.8% 78.7%
2022 82.1% 79.8% 84.2%
2023 88.2% 86.2% 89.9%

RAP Champions

We asked respondents who had heard of RAP whether their department had a RAP champion and whether they knew who it was.

RAP champions support and promote the use of RAP across government. Please contact the analysis standards and pipelines team for any enquiries about RAP or the champions network.

Knowledge Percent
Yes, and I am a RAP Champion 5%
Yes, and I know who the RAP Champion is 31.2%
Yes, but I don't know who the RAP Champion is 15.2%
No 9%
I don't know 39.5%
Sample size = 992

Over half (52%) of respondents were aware that their department had a RAP champion. However, 40% did not know whether their department had one, indicating differences in RAP support and promotion between departments.

Awareness of RAP strategy

We asked respondents who had heard of RAP whether they had heard of the RAP strategy.

The Analysis Function RAP strategy was released in June 2022 and sets out plans for adopting RAP across government.

Awareness of the RAP strategy Percent
Yes 29.1%
Yes, but I haven't read it 41%
No 29.8%
Sample size = 992

Most respondents (70%) were aware of the RAP strategy, but 58% of these had not read it. Further analysis showed that membership of a profession was positively associated with awareness of the RAP strategy, but did not affect whether analysts had read it, while grade had no effect.
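The two percentages in this paragraph use different denominators: the first is a share of everyone who answered the question, the second a share of those who were aware of the strategy. The short sketch below reproduces both from the table, using illustrative function and variable names.

```python
def awareness_summary(read_pct, not_read_pct):
    """Return (percent aware, percent of the aware group who have not read it).

    Both inputs are percentages of all respondents to the question.
    """
    aware = read_pct + not_read_pct
    not_read_share_of_aware = 100 * not_read_pct / aware
    return aware, not_read_share_of_aware


# 'Yes' = 29.1% and 'Yes, but I haven't read it' = 41% in the table above.
aware, unread_share = awareness_summary(29.1, 41.0)
print(f"Aware of the strategy: {aware:.0f}% of respondents")
print(f"Not read it: {unread_share:.0f}% of those aware")
```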

Opinions on RAP

We asked respondents who had heard of RAP whether they agreed with a series of statements.

Statement Strongly Disagree (%) Disagree (%) Neutral (%) Agree (%) Strongly Agree (%)
I and/or my team are currently implementing RAP 15.7% 21.4% 18.9% 26% 18%
I feel confident implementing RAP in my work 9.2% 19.6% 21.1% 33.2% 17%
I feel supported to implement RAP in my work 8.1% 15% 24.7% 35.1% 17.1%
I know where to find resources to help me implement RAP 7.5% 17.6% 19.2% 37.1% 18.6%
I or my team are planning on implementing RAP in the next 12 months 8.3% 11.9% 24.5% 32.4% 23%
I think it is important to implement RAP in my work 3.7% 5.6% 16.6% 37.1% 36.9%
I understand what the key components of the RAP methodology are 7.2% 17.5% 16.4% 38.9% 20%
Sample size = 992

The results showed that 74% of respondents agreed that it is important to implement RAP in their work. However, only 44% of respondents said that they were doing so, with 55% planning to implement RAP in the next 12 months. This discrepancy between support for RAP and its implementation suggests that there are barriers to implementation that still need to be overcome.

The responses were converted into a composite score after checking for consistency across the statements, and ANOVA was used to test if there was an effect of grade or profession membership on respondents’ opinions of RAP. Respondents who were members of a profession were generally more positive in their opinions of RAP, while grade had no effect. Professions can play an important role in promoting best practice such as RAP within analytical communities.
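The report does not describe how consistency was checked or how the composite score was formed. One common approach, assumed here for illustration, is to code each response from 1 (Strongly Disagree) to 5 (Strongly Agree), check internal consistency with Cronbach's alpha, and use each respondent's mean score as the composite. The responses below are hypothetical and are not survey data.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for a list of respondents' item scores.

    `responses` is a list of equal-length lists, one per respondent.
    """
    n_items = len(responses[0])
    item_variances = [variance(item) for item in zip(*responses)]
    total_variance = variance([sum(person) for person in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def composite_scores(responses):
    """Return each respondent's mean score across all statements."""
    return [mean(person) for person in responses]


# Hypothetical answers from five respondents to four statements (1 to 5).
HYPOTHETICAL_RESPONSES = [
    [4, 4, 5, 4],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(HYPOTHETICAL_RESPONSES)
scores = composite_scores(HYPOTHETICAL_RESPONSES)
print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Composite scores: {scores}")
```

The composite scores could then be compared between groups, for example with a one-way ANOVA from a statistics library. That step is not shown here.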

Good coding practices

We asked respondents who reported writing code at work about the good practices they apply when doing so.

These questions cover many of the coding practices recommended in the quality assurance of code for analysis and research guidance, as well as the minimum RAP standards set by the cross-government RAP champions network.

Coding practices have been classified as either ‘Basic’ or ‘Advanced’. Basic practices are those that make up the minimum RAP standards, while Advanced practices help improve reproducibility. The percentage of respondents who reported applying these practices either ‘Regularly’ or ‘All the time’ is shown below.

Open sourcing was defined as ‘making code freely available to be modified and redistributed’.

RAP component Type Percentage of analysts who code in their work
Use open source software Basic 71.2%
Proportionate QA Basic 65.6%
Peer review Basic 53.6%
Version control Basic 44.7%
Documentation Basic 29.4%
Team open source code Basic 12.5%
Functions Advanced 55.9%
Follow code style guidelines Advanced 49.3%
Function documentation Advanced 31.6%
Dependency management Advanced 29.4%
Unit testing Advanced 22.1%
Continuous integration Advanced 17.4%
Code packages Advanced 11.9%
Sample size = 1125

The most commonly used practices were the use of open source software (71%) and proportionate QA (66%). Open sourcing code (13%) and packaging code (12%) were the practices least commonly applied.
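To illustrate three of the practices listed above (functions, function documentation and unit testing), the sketch below wraps the ‘Regularly’ plus ‘All the time’ calculation in a documented function with a small test case. It is an illustrative example, not the code used to produce this report, and the function name is hypothetical.

```python
import unittest


def share_regular_or_always(regularly: float, all_the_time: float) -> float:
    """Return the percentage of respondents applying a practice consistently.

    Args:
        regularly: Percentage answering 'Regularly'.
        all_the_time: Percentage answering 'All the time'.

    Returns:
        The combined percentage, rounded to one decimal place.

    Raises:
        ValueError: If an input is outside 0 to 100 or the total exceeds 100.
    """
    for value in (regularly, all_the_time):
        if not 0 <= value <= 100:
            raise ValueError("percentages must be between 0 and 100")
    total = regularly + all_the_time
    if total > 100:
        raise ValueError("combined percentage cannot exceed 100")
    return round(total, 1)


class TestShareRegularOrAlways(unittest.TestCase):
    """Unit tests for share_regular_or_always."""

    def test_version_control(self):
        # Version control: 18.3% 'Regularly' plus 26.4% 'All the time'.
        self.assertEqual(share_regular_or_always(18.3, 26.4), 44.7)

    def test_rejects_impossible_total(self):
        with self.assertRaises(ValueError):
            share_regular_or_always(80.0, 30.0)
```

Saved as a module, the tests can be run with `python -m unittest`. The first test uses the version control figures from the frequency table in the next section, which give the 44.7% shown above.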

Consistency of good coding practices

We asked respondents who reported writing code at work how frequently they apply good coding practices.

Statement I don't understand this question (%) Never (%) Rarely (%) Sometimes (%) Regularly (%) All the time (%)
Automated data quality assurance 6.6% 20.1% 15.5% 26.9% 18.9% 12%
Code review 1.1% 7.8% 10.3% 27.2% 29.4% 24.2%
Coding guidelines / Style guides 6.8% 11.4% 9.4% 23.1% 30.5% 18.8%
Functions 5.2% 8.6% 10.1% 20.2% 28.4% 27.6%
Open source own code 17.6% 45.1% 13.1% 11.7% 6.8% 5.7%
Packaging code 12.4% 49.7% 14.3% 11.6% 6.9% 5%
Proportionate quality assurance 11.2% 6.4% 3.4% 13.4% 36% 29.6%
Quality assurance throughout development 6.8% 8.1% 6.8% 18% 35.5% 24.9%
Standard directory structure 25.5% 16.7% 8.6% 15.7% 19.1% 14.3%
Unit testing 25.8% 20% 13.8% 18.3% 13.7% 8.4%
Use open source software 1.8% 6.1% 8.4% 12.5% 23.8% 47.4%
Version control 4.8% 27.7% 9.5% 13.2% 18.3% 26.4%
Sample size = 1125

The results show that while good coding practices are being used in analytical code, they are not applied consistently. For example, 42% of respondents reported automating quality assurance only ‘Sometimes’ or ‘Rarely’. Similarly, around a quarter of respondents did not understand the questions on standard directory structures (26%) and unit testing (26%), suggesting that skills gaps or terminology barriers remain in some areas.
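As a simple example of what automated data quality assurance can look like, the sketch below checks that the response percentages for each practice sum to 100, allowing for rounding. The rows are copied from the table above; the function name and the tolerance of 0.2 percentage points are assumptions made for this example.

```python
def rows_not_summing_to_100(table, tolerance=0.2):
    """Return the labels of rows whose percentages do not sum to 100.

    `tolerance` allows for rounding in published figures.
    """
    return [
        label
        for label, percentages in table.items()
        if abs(sum(percentages) - 100.0) > tolerance
    ]


# Column order: don't understand, never, rarely, sometimes,
# regularly, all the time.
PRACTICE_FREQUENCIES = {
    "Automated data quality assurance": [6.6, 20.1, 15.5, 26.9, 18.9, 12.0],
    "Code review": [1.1, 7.8, 10.3, 27.2, 29.4, 24.2],
    "Unit testing": [25.8, 20.0, 13.8, 18.3, 13.7, 8.4],
    "Version control": [4.8, 27.7, 9.5, 13.2, 18.3, 26.4],
}

failures = rows_not_summing_to_100(PRACTICE_FREQUENCIES)
if failures:
    raise ValueError(f"Rows failing the sum check: {failures}")
print("All rows sum to 100 within tolerance")
```

A check like this can run every time the tables are regenerated, so that errors in the data are caught before publication rather than by manual inspection.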

Code documentation

We asked respondents who reported writing code at work how frequently they write different forms of documentation when programming in their current role.

Embedded documentation is one of the components which make up the RAP minimum viable product. Documentation is important to help others be clear on how to use the product and what the code is intended to do.

Statement I don't understand this question (%) Never (%) Rarely (%) Sometimes (%) Regularly (%) All the time (%)
Analytical Quality Assurance (AQA) logs 19.6% 30.8% 12.2% 18.9% 12.4% 6%
Code comments 2.3% 4.2% 1.8% 8.3% 27.8% 55.6%
Data or assumptions registers 20.1% 36% 11.7% 13.1% 10.9% 8.2%
Desk notes 20.3% 19.5% 10.2% 20.5% 19.6% 10%
Documentation for each function or class 9.8% 24.4% 14.5% 19.6% 17.8% 13.9%
Flow charts 6.3% 28.2% 20.3% 26.8% 12.9% 5.6%
README files 5.8% 26.3% 13.7% 23.5% 17.1% 13.7%
Sample size = 1125

In line with previous years, code comments remain the most common form of documentation, with 83% of respondents using them regularly or all of the time. All other forms of documentation were used much less consistently, suggesting that documentation is not being prioritised in analytical work. This has implications for the reproducibility and long-term sustainability of projects.

Summary

The RAP strategy has three goals:

  • to ensure analysts have the right tools to implement RAP principles
  • to ensure analysts are supported to implement RAP principles
  • to ensure there is a culture of RAP principles by default for analysis

The results from CARS 2023 indicate that public sector analysts are regularly using coding in their work and that open source tools are widely accessible. There is still progress to be made against the first goal, however, particularly in ensuring Git is available to all public sector analysts. There also remains some discrepancy between availability and capability in different tools, meaning that even where open source tools are available they are not necessarily being used.

Over half of respondents agreed that they felt supported to implement RAP principles and that they knew where to find resources to help. The data suggest that coding capability is increasing on the whole, although there are still some areas in which it is negatively affected. This is likely due to a lack of opportunity to maintain coding skills in role, as coding frequency is strongly positively correlated with increasing capability. To maintain and improve the coding skills of analysts, it is important to allow time for active development as well as providing training resources.

Although awareness of RAP appears to be increasing, there remains some inconsistency in the application of RAP principles. This suggests that coding best practice is not always applied by default. While this can be attributed to skills gaps in some instances, in others there may be specific project, role or departmental factors that mean RAP is being under-used.