Dr Paul Richmond

Research Software Engineer

Journal Publications

  • "PI-FLAME: A parallel immune system simulator using the FLAME graphic processing unit environment"

    S. Tamrakar, P. Richmond, R. M. D’Souza (2016)

    Simulation: Transactions of the Society for Modeling and Simulation International (to appear)

    Agent-based models (ABMs) are increasingly being used to study population dynamics in complex systems, such as the human immune system. Previously, Folcik et al. (The basic immune simulator: an agent-based model to study the interactions between innate and adaptive immunity. Theor Biol Med Model 2007; 4: 39) developed a Basic Immune Simulator (BIS) and implemented it using the Recursive Porous Agent Simulation Toolkit (RePast) ABM simulation framework. However, frameworks such as RePast are designed to execute serially on central processing units and therefore cannot efficiently handle large model sizes. In this paper, we report on our implementation of the BIS using FLAME GPU, a parallel computing ABM simulator designed to execute on graphics processing units. To benchmark our implementation, we simulate the response of the immune system to a viral infection of generic tissue cells. We compared our results with those obtained from the original RePast implementation for statistical accuracy. We observe that our implementation has a 13x performance advantage over the original RePast implementation.

  • "SpineCreator: a Graphical User Interface for the Creation of Layered Neural Models"

    A. J. Cope, P. Richmond, S. S. James, K. Gurney, D. J. Allerton (2016)

    Neuroinform (2016). doi:10.1007/s12021-016-9311-z

    There is a growing requirement in computational neuroscience for tools that permit collaborative model building, model sharing, combining existing models into a larger system (multi-scale model integration), and are able to simulate models using a variety of simulation engines and hardware platforms. Layered XML model specification formats solve many of these problems; however, they are difficult to write and visualise without tools. Here we describe a new graphical software tool, SpineCreator, which facilitates the creation and visualisation of layered models of point spiking neurons or rate coded neurons without the need for programming. We demonstrate the tool through the reproduction and visualisation of published models and show simulation results using code generation interfaced directly into SpineCreator. As a unique application for the graphical creation of neural networks, SpineCreator represents an important step forward for neuronal modelling.

  • "Osteolytica: An automated image analysis software package that rapidly measures cancer-induced osteolytic lesions in in vivo models with greater reproducibility compared to other commonly used methods."

    Evans HR, Karmakharm T, Lawson MA, Walker RE, Harris W, Fellows C, Huggins ID, Richmond P, Chantry AD (2015)

    Bone. 2015 Oct 8;83:9-16. doi: 10.1016/j.bone.2015.10.004.

    Methods currently used to analyse osteolytic lesions caused by malignancies such as multiple myeloma and metastatic breast cancer vary from basic 2-D X-ray analysis to 2-D images of micro-CT datasets analysed with non-specialised image software such as ImageJ. However, these methods have significant limitations. They do not capture 3-D data, they are time-consuming and they often suffer from inter-user variability. We therefore sought to develop a rapid and reproducible method to analyse 3-D osteolytic lesions in mice with cancer-induced bone disease. To this end, we have developed Osteolytica, an image analysis software method featuring an easy to use, step-by-step interface to measure lytic bone lesions. Osteolytica utilises novel graphics card acceleration (parallel computing) and 3-D rendering to provide rapid reconstruction and analysis of osteolytic lesions. To evaluate the use of Osteolytica we analysed tibial micro-CT datasets from murine models of cancer-induced bone disease and compared the results to those obtained using a standard ImageJ analysis method. Firstly, to assess inter-user variability we deployed four independent researchers to analyse tibial datasets from the U266-NSG murine model of myeloma. Using ImageJ, inter-user variability between the bones was substantial (±19.6%), in contrast to using Osteolytica, which demonstrated minimal variability (±0.5%). Secondly, tibial datasets from U266-bearing NSG mice or BALB/c mice injected with the metastatic breast cancer cell line 4T1 were compared to tibial datasets from aged and sex-matched non-tumour control mice. Analyses by both Osteolytica and ImageJ showed significant increases in bone lesion area in tumour-bearing mice compared to control mice. These results confirm that Osteolytica performs as well as the current 2-D ImageJ osteolytic lesion analysis method. However, Osteolytica is advantageous in that it analyses over the entirety of the bone volume (as opposed to selected 2-D images), it is a more rapid method and it has less user variability.

  • "From Model Specification to Simulation of Biologically Constrained Networks of Spiking Neurons"

    Richmond P, Cope A, Gurney K, Allerton DJ. (2014)

    Neuroinformatics, April 2014, Volume 12, Issue 2, pp 307-323

    A declarative extensible markup language (SpineML) for describing the dynamics, network and experiments of large-scale spiking neural network simulations is described which builds upon the NineML standard. It utilises a level of abstraction which targets point neuron representation but addresses the limitations of existing tools by allowing arbitrary dynamics to be expressed. The use of XML promotes model sharing, is human readable and allows collaborative working. The syntax uses a high-level self-explanatory format which allows straightforward code generation or translation of a model description to a native simulator format. This paper demonstrates the use of code generation in order to translate, simulate and reproduce the results of a benchmark model across a range of simulators. The flexibility of the SpineML syntax is highlighted by reproducing a pre-existing, biologically constrained model of a neural microcircuit (the striatum). The SpineML code is open source and is available at http://bimpa.group.shef.ac.uk/SpineML.

  • "Democratic Population Decisions Result in Robust Policy-Gradient Learning: A Parametric Study with GPU Simulations"

    Richmond P, Buesing L, Giugliano M, Vasilaki E (2011)

    PLoS ONE 6(5): e18539. doi:10.1371/journal.pone.0018539

    High performance computing on the Graphics Processing Unit (GPU) is an emerging field driven by the promise of high computational power at a low cost. However, GPU programming is a non-trivial task and moreover architectural limitations raise the question of whether investing effort in this direction may be worthwhile. In this work, we use GPU programming to simulate a two-layer network of Integrate-and-Fire neurons with varying degrees of recurrent connectivity and investigate its ability to learn a simplified navigation task using a policy-gradient learning rule stemming from Reinforcement Learning. The purpose of this paper is twofold. First, we want to support the use of GPUs in the field of Computational Neuroscience. Second, using GPU computing power, we investigate the conditions under which the said architecture and learning rule demonstrate best performance. Our work indicates that networks featuring strong Mexican-Hat-shaped recurrent connections in the top layer, where decision making is governed by the formation of a stable activity bump in the neural population (a “non-democratic” mechanism), achieve mediocre learning results at best. In the absence of recurrent connections, where all neurons “vote” independently (“democratic”) for a decision via population vector readout, the task is generally learned better and more robustly. Our study would have been extremely difficult on a desktop computer without the use of GPU programming. We present the routines developed for this purpose and show that a speed improvement of 5x up to 42x is provided versus optimised Python code. The higher speed is achieved when we exploit the parallelism of the GPU in the search of learning parameters. This suggests that efficient GPU programming can significantly reduce the time needed for simulating networks of spiking neurons, particularly when multiple parameter configurations are investigated.

  • "High Performance Cellular Level Agent-based Simulation with FLAME for the GPU"

    Richmond P, Walker D, Coakley S, Romano D (2010)

    Briefings in Bioinformatics, May 2010; 11(3), pages 334-47

    Driven by the availability of experimental data and ability to simulate a biological scale which is of immediate interest, the cellular scale is fast emerging as an ideal candidate for middle-out modelling. As with ‘bottom-up’ simulation approaches, cellular level simulations demand a high degree of computational power, which in large-scale simulations can only be achieved through parallel computing. The flexible large-scale agent modelling environment (FLAME) is a template driven framework for agent-based modelling (ABM) on parallel architectures ideally suited to the simulation of cellular systems. It is available for both high performance computing clusters (www.flame.ac.uk) and GPU hardware (www.flamegpu.com) and uses a formal specification technique that acts as a universal modelling format. This not only creates an abstraction from the underlying hardware architectures, but avoids the steep learning curve associated with programming them. In benchmarking tests and simulations of advanced cellular systems, FLAME GPU has reported massive improvement in performance over more traditional ABM frameworks. This allows the time spent in the development and testing stages of modelling to be drastically reduced and creates the possibility of real-time visualisation for simple visual face-validation.

Peer Reviewed Conference Proceedings (and extended abstracts)

  • "From GUI to GPU: A toolchain for GPU code generation for large scale Drosophila simulations using SpineML (CNS Extended Abstract)"

    Adam Tomkins, Carlos Luna Ortiz, Daniel Coca and Paul Richmond (2016)

    Front. Neuroinform. Conference Abstract: Neuroinformatics 2016. doi: 10.3389/conf.fninf.2016.20.00049


  • "Feasibility Study of Multi-Agent Simulation at the Cellular Level with FLAME GPU"

    de Paiva Oliveira A, Richmond P (2016)

    Proceedings of the Twenty-Ninth International FLAIRS Conference, pages 398-403

    Multi-Agent Systems (MAS) are a common approach to simulating biological systems. Multi-agent modelling provides a natural method for describing individual level behaviours of cells. However, the computational cost of simulating behaviours at an individual level is considerably larger than that of top-down equation-based modelling approaches. A recent possibility to improve computational performance is the use of Graphics Processing Units (GPUs) to provide the necessary parallel computing power. In this paper we show that multi-agent models describing biological systems at cellular level are well suited to GPU acceleration. Cellular level systems are characterised by vast numbers of agents that intensively communicate, indirectly through diffusion of chemical substances, or directly, through connection of chemical receptors. We present a study which utilises the FLAME GPU software to target a MAS model of a generic pathogen induced infection to validate the suitability of the GPU for simulation of a broader class of cellular level systems.

  • "Large-Scale Simulations with FLAME"

    Coakley S, Richmond P, Gheorghe M, Chin S, Worth D, Holcombe M, Greenough C (2016)

    Intelligent Agents in Data-intensive Computing, Studies in Big Data Series, ISBN 978-3-319-23742-8, pages 123-142

    This chapter presents the latest stage of the FLAME development - the high-performance environment FLAME-II and the parallel architecture designed for Graphics Processing Units, FLAMEGPU. The architecture and the performances of these two agent-based software environments are presented, together with illustrative large-scale simulations for systems from biology, economy, psychology and crowd behaviour applications.

  • "Road Network Simulation using FLAME GPU"

    Heywood P, Richmond P, Maddock S (2015)

    Euro-Par 2015: Parallel Processing Workshops, Volume 9523 of the series Lecture Notes in Computer Science pp 430-441

    Demand for high performance road network simulation is increasing due to the need for improved traffic management to cope with the globally increasing number of road vehicles and the poor capacity utilisation of existing infrastructure. This paper demonstrates FLAME GPU as a suitable Agent Based Simulation environment for road network simulations, capable of coping with the increasing demands on road network simulation. Gipps’ car following model is implemented and used to demonstrate the performance of simulation as the problem size is scaled. The performance of message communication techniques has been evaluated to give insight into the impact of runtime generated data structures to improve agent communication performance. A custom visualisation is demonstrated for FLAME GPU simulations and the techniques used are described.

  • "Visualising Real Time Large Scale Micro-Simulation of Transport Networks"

    Heywood P, Richmond P, Maddock S (2015)

    Proc. Computer Graphics & Visual Computing (CGVC) 2015, Wednesday, 16-17 September, 2015, University College London, UK (extended abstract)

  • "Complex system simulations on the GPU"

    Richmond P, Heywood P (2015)

    in Proceedings of EMIT 2015 ISBN

    Simulation of complex systems provides a computational challenge due to the amount of computational performance required to simulate all individuals within a large population. The Graphics Processing Unit (GPU) presents a potential solution. Using the CUDA programming language, GPUs can be programmed for general purpose use. Unfortunately, the translation of a complex systems model to CUDA code is a non-trivial task requiring specialist knowledge of the architecture to obtain good performance. This paper reviews FLAME GPU, a framework which provides transparent mapping and simulation of complex systems models to a CUDA enabled GPU. The framework has considerable performance benefits over its CPU counterpart and provides real-time visualisation for interactive steering of simulations in domains as diverse as computational biology and pedestrian dynamics.

  • "The SpineML toolchain: enabling computational neuroscience through flexible tools for creating, sharing, and simulating neural models (extended abstract)"

    Alexander J Cope, Paul Richmond, and Dave Allerton (2014)

    BMC Neuroscience. 2014;15(Suppl 1):P224. doi:10.1186/1471-2202-15-S1-P224.


  • "Resolving conflicts between multiple competing agents in parallel simulations"

    Richmond P (2014)

    Euro-Par 2014: Parallel Processing Workshops, Volume 8805 of the series Lecture Notes in Computer Science pp 371-382

    Agents within multi-agent simulation environments frequently compete for limited resources, requiring negotiation to resolve ‘conflict’. The negotiation process for resolving conflict often relies on transactional or serial processes that complicate implementation within a parallel simulation framework. This paper demonstrates how transactional events to resolve competition can be implemented within a parallel simulation framework (FLAME GPU) as a series of iterative parallel agent functions. A sugarscape model where agents compete for space and a model requiring optimal assignment between two populations, the stable marriage problem, are demonstrated. The two case studies act as a building block for more general conflict resolution behaviours requiring negotiation between agents in a parallel simulation environment. Extensions to the FLAME GPU framework are described and performance results are provided to show scalability of the case studies on recent GPU hardware.

  • "High Performance Agent-based Simulation"

    Richmond P, Karmakharm T, Coakley S (2012)

    in Proc. of Royal Aeronautical Society (RAeS) Autumn Flight Simulation Conference (Flight Simulation Research – New Frontiers), 28-29th November


  • "Path Tracing on Massively Parallel Neuromorphic Hardware"

    Richmond P, Allerton DJ (2012)

    in proceedings of Theory and Practice of Computer Graphics UK (TPCG) 2012 pages 25-28.

    Ray tracing on parallel hardware has recently benefited from significant advances in the graphics hardware and associated software tools. Despite this, the SIMD nature of graphics card architectures is only able to perform well on groups of coherent rays which exhibit little in the way of divergence. This paper presents SpiNNaker, a massively parallel system based on low power ARM cores, as an architecture suitable for ray tracing applications. The asynchronous design allows us to demonstrate a linear performance increase with respect to the number of cores. The performance per Watt ratio achieved within the fixed-point path tracing example presented is far greater than that of a multi-core CPU and similar to that of a GPU under optimal conditions.

  • "Large Scale Pedestrian Multi-Simulation for a Decision Support Tool"

    Karmakharm T, Richmond P (2012)

    in proceedings of Theory and Practice of Computer Graphics UK (TPCG) 2012 pages 41-44.

    The ability to simulate pedestrian behaviour on a large scale is essential in identifying potential dangers in public spaces during an evacuation. Multiple designs must be tested with varying parameters and run multiple times to achieve statistical significance due to the model's stochastic nature. In this short paper, we describe our prototype decision support tool that enables concurrent simulation on GPU-enabled computers by merging them to increase efficiency and dispatching simulation jobs across multiple machines on the network. Preliminary results with our GPU-optimised model have been shown to run at faster than real-time simulation speeds.

  • "FLAME GPU Technical Report and User Guide"

    Richmond P (2011)

    University of Sheffield, Department of Computer Science Technical Report CS-11-03


  • "Template driven agent based modelling and simulation with CUDA"

    Richmond P, Romano D (2011)

    GPU Computing Gems Emerald Edition, Wen-mei Hwu (Editor), Morgan Kaufmann, January 2011, ISBN: 978-0-12-384988-5, pages 313-324.

    This chapter describes a number of key techniques that are used to implement a flexible agent-based modeling (ABM) framework entirely on the GPU in CUDA. Performance rates equaling or bettering that of high-performance computing (HPC) clusters can easily be achieved, with obvious cost-to-performance benefits. Massive population sizes can be simulated, far exceeding those that can be computed (in reasonable time constraints) within traditional ABM toolkits. The use of data parallel methods ensures that the techniques used within this chapter are applicable to emerging multicore and data parallel architectures that will continue to increase their level of parallelism to improve performance. The concept of a flexible architecture is built around the use of a neutral modeling language (XML) for agents. The technique of template-driven dynamic code generation specifically using XML template processing is also general enough to be effective in other domains seeking to solve the issue of portability and abstraction of modeling logic from simulation code.

  • "Audio-Visual Animation of Urban Space"

    Richmond P, Smyronova Y, Maddock S, Kang J (2010)

    In Proc. of Theory and Practice of Computer Graphics (TPCG) 2010 pages 189-190.

    We present a technique for simulating accurate physically modelled acoustics within an outdoor urban environment and a tool that presents the acoustics alongside a visually rendered counterpart. Acoustic modelling is achieved by using a mixture of simulating ray-traced specular sound wave reflections and applying radiosity to simulate diffuse reflections. Sound rendering is applied to the energy response of the acoustic modelling stage and is used to produce a number of binaural samples for playback with headphones. The visual tool which has been created unites the acoustic renderings with an accurate 3D representation of the virtual environment. As part of this tool an interpolation technique has been implemented allowing a user controlled walkthrough of the simulated environment. This produces better sound localisation effects than listening from a set number of static locations.

  • "Integration of Acoustic Simulation with Interactive Visual Animation of Urban Environment"

    Kang J, Smyronova Y, Richmond P, Maddock S (2010)

    Invited Paper, Proc. of EAA EUROREGIO 2010 Summer School, 15-18 September 2010, Ljubljana, Slovenia

  • "Agent-based Large Scale Simulation of Pedestrians With Adaptive Realistic Navigation Vector Field"

    Karmakharm T, Richmond P, Romano D (2010)

    In Proc. of Theory and Practice of Computer Graphics (TPCG) 2010 pages 67-74.

    A large scale pedestrian simulation method, implemented with an agent based modelling paradigm, is presented within this paper. It allows rapid prototyping and real-time modifications, suitable for quick generation and testing of the viability of pedestrian movement in urban environments. The techniques described for pedestrian simulation make use of parallel processing through graphics card hardware allowing simulation scales to far exceed those of serial frameworks for agent based modelling. The simulation has been evaluated through benchmarking of the performances manipulating population size, navigation grid, and averaged simulation steps. The results demonstrate that this is a robust and scalable method for implementing pedestrian navigation behaviour. Furthermore, an algorithm for generating smooth and realistic pedestrian navigation paths that works well in both small and large spaces is presented. An adaptive smoothing function has been utilised to optimise the path used by pedestrian agents to navigate around in a complex dynamic environment. Optimised and un-optimised vector maps, obtained with and without this function, are compared, and the results show that the optimised path generates a more realistic flow.

  • "FLAME simulating Large Populations of Agents on Parallel Platforms"

    Kiran M, Richmond P, Holcombe M, Chin Lee S, Worth D, Greenough C (2010)

    Proc. of 9th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2010), van der Hoek, Kaminka, Lesperance, Luck and Sen (eds.) May 10-14, 2010, Toronto, Canada, pages 1633-1636.

    High performance computing is essential for simulating complex problems using agent-based modelling (ABM). Researchers are hindered by the complexities of porting models to parallel platforms and the time taken to run large simulations on a single machine. This paper presents the FLAME framework, the only supercomputing framework which automatically produces parallelisable code to execute on different parallel hardware architectures. FLAME’s inherent parallelism allows large numbers of agents to be simulated in less time than comparable simulation frameworks. The framework also handles the parallelisation of model code, allowing modellers to run simulations on a number of supported architectures. The framework has been well tested in various disciplines such as biology and economics, yielding successful research results in projects such as EURACE, where the European economy was modelled using agents. More recently FLAME has been ported to consumer NVIDIA Graphics Processing Units (GPUs) allowing parallel performance equal to that of grid architectures with the ability to perform real time visualisation.

  • "Cellular Level Agent Based Modelling on the Graphics Processing Unit"

    Richmond P, Coakley S, Romano D (2009)

    Proc. of HiBi09 - High Performance Computational Systems Biology, 14-16 October 2009, Trento, Italy, pages 43-50

    Cellular level agent based modelling is reliant on either sequential processing environments or expensive and largely unavailable PC grids. The GPU offers an alternative architecture for such systems; however, the steep learning curve associated with the GPU's data parallel architecture has previously limited the uptake of this emerging technology. In this paper we demonstrate a template driven agent architecture which provides a mapping of XML model specifications and C language scripting to optimised Compute Unified Device Architecture (CUDA) for the GPU. Our work is validated through the implementation of a Keratinocyte model using limited range message communication with non linear time simulation steps to resolve inter cellular forces. The performance gain achieved over existing modelling techniques reduces simulation times from hours to seconds. The improvement of simulation performance allows us to present a real time visualisation technique which was previously unobtainable.

  • "NARCSim An Agent-Based Illegal Drug Market Simulation"

    Romano D, Lomax L, Richmond P (2009)

    in proceedings of Games Innovations Conference, 2009. ICE-GIC 2009. International IEEE Consumer Electronics Society's, pages 101-108

    Combined forces service interventions in the UK illegal drug market can be designed and evaluated using a serious game, where the illegal drug market can be simulated using an agent-based model with a large number of different classes of human behaviour. This paper presents NARCSim, the Intelligent-ABM that is used to power the illegal drug market serious game under construction. NARCSim is an Agent-Based Social Simulation of Heroin, Cannabis, Cocaine users and dealers, police and treatment officers' behaviour in the UK. The agents' behaviour has been formalised using X-machines and implemented on the agent-based framework FLAME.

  • "A High Performance Agent Based Modelling Framework on Graphics Card Hardware with CUDA"

    Richmond P, Coakley S, Romano D (2009)

    in proceedings of AAMAS '09 Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2, pages 1125-1126

    We present an efficient implementation of a high performance parallel framework for Agent Based Modelling (ABM), exploiting the parallel architecture of the Graphics Processing Unit (GPU). It provides a mapping between formal agent specifications, with C based scripting, and optimised NVIDIA Compute Unified Device Architecture (CUDA) code. The mapping of agent data structures and agent communication is described, and our work is evaluated through a number of simple interacting agent examples. In contrast with an alternative, single machine CPU implementation, a speedup of up to 250 times is reported.

  • "A High Performance Framework For Agent Based Pedestrian Dynamics On GPU Hardware"

    Richmond P, Romano D (2008)

    Proceedings of EUROSIS ESM 2008 (European Simulation and Modelling), October 27-29, 2008, Universite du Havre, Le Havre, France


    Pedestrian simulations have recently focused on top-down implementations ignoring more computationally intensive agent based dynamics. This paper presents a framework for agent based modelling on the Graphics Processing Unit (GPU) which demonstrates large scale pedestrian simulation and rendering. GPU hardware offers significant performance for dynamic large crowd simulations, however the process of mapping computational tasks to the GPU is not trivial and expert knowledge is required. An agent based specification technique is therefore presented, which allows the underlying GPU data storage and agent communication to be hidden. The framework allows the use of static maps to set static environment obstacles and a zoning technique is described for route planning. Parallel population feedback routines are also used to implement Level of Detail (LOD) rendering which avoids any costly CPU data read-back.

  • "Agent Based GPU, a Real-time 3D Simulation and Interactive Visualisation Framework for Massive Agent Based Modelling on the GPU"

    Richmond P, Romano D (2008)

    Proceedings of International Workshop on Supervisualisation 2008

    Traditional Agent Based Modelling (ABM) applications and frameworks lack the close coupling between the simulation behaviour and its visualisation that is required to achieve real time interactive performance with populations above a couple of thousand. The Graphics Processing Unit (GPU) offers an ideal solution to simulate and visualise the behaviour of high population ABM. The parallel nature of processing offers significant and scalable performance increases, with the added benefit of avoiding data transfer between the simulation and rendering stages. In this paper we demonstrate a framework for real-time simulation and visualisation of massive Agent Based modelling on the GPU (ABGPU).

  • "Automatic Generation of Residential Areas using GeoDemographics"

    Richmond P, Romano D (2007)

    Advances in 3D Geoinformation Systems, Part of the series Lecture Notes in Geoinformation and Cartography, pages 401-416

    The neighbourhood aspect of city models is often overlooked in methods of generating detailed city models. This paper identifies two distinct styles of virtual city generation and highlights the weaknesses and strengths of both, before proposing a geo-demographically based solution to automatically generate 3D residential neighbourhood models suitable for use within simulative training. The algorithm's main body of work focuses on a classification based system which applies a texture library of captured building instances to extruded and optimised virtual buildings created from 2D GIS data.