GTC-P

1. Code Name: GTC-P
2. Code Category: GK-PIC
3. Primary Developer: Stephane Ethier
4. Other Developers and Users: Mark Adams
5. Short description (one line if possible): 2D domain decomposition version of the GTC global gyrokinetic PIC code for studying microturbulent core transport.
6. Computer Language (Fortran77, Fortran90, C, C++, etc) and approx # of lines: 15,000 lines of Fortran90 plus a few C routines. Also contains triangle.c (16,000 lines), used to construct a triangular mesh for the alternate finite element solver. Total = 31,000 lines.
7. Type of input required (including files and/or output from other codes). Is there any special input preparation system (eg, GUI): A small text file containing run parameters in Fortran namelist format (a sketch of such a file follows).
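For illustration, a minimal sketch of such a namelist file; the group and parameter names (input_parameters, mstep, tstep, micell, mpsi, nonlinear) are hypothetical stand-ins rather than GTC-P's actual variables:

    ! hypothetical gtc.in -- names are illustrative only
    &input_parameters
      mstep     = 1000     ! number of time steps
      tstep     = 0.02     ! time step size
      micell    = 100      ! marker ions per grid cell
      mpsi      = 90       ! number of radial grid points
      nonlinear = 1        ! 1 = nonlinear run, 0 = linear
    /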
8. Type of output produced (including files that are read by other codes and sizes of large files and synthetic diagnostics): Time-dependent volume-averaged and surface-averaged data (fluxes, modes, etc.) in text format; snapshots of spatially varying data and distribution functions, also in text format; 3D potential and density fluctuation data in native binary format; checkpoint-restart files containing all particle data and fields, written at regular intervals in ADIOS BP format (ORNL's high-performance I/O library).
9. Describe any post-processors which read the output files: In-house IDL analysis tool, MATLAB scripts, and specialized Fortran analysis routines.
10. Status and location of code input/output documentation: None
11. Code web site? Under construction
12. Is code under version control? What system? Is automated regression testing performed? The code is under SVN version control. No automated regression testing is performed.
13. One to two paragraph description of equations solved and functionality including what discretizations are used in space and time:
  • Solves the global, nonlinear gyrokinetic equation using the particle-in-cell method. The Poisson equation is solved using the PETSc library (Jacobi preconditioner + conjugate gradient solver). The time advance of the particles is implemented with a second-order Runge-Kutta algorithm (a sketch follows this list).
  • A large-aspect-ratio equilibrium with circular cross-section is hard-coded in the source. Only adiabatic electrons and a single ion species at the moment.
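For concreteness, a minimal Fortran sketch of a second-order Runge-Kutta (midpoint) push in a reduced 1D phase space. All names here (rk2_push, accel) are illustrative assumptions; the actual code advances gyrocenters in toroidal field-line coordinates, with the charge deposition and PETSc field solve occurring between the two stages:

    ! Minimal sketch of a second-order Runge-Kutta (midpoint) particle push.
    ! Names and the 1D phase space are illustrative, not GTC-P's actual code.
    subroutine rk2_push(np, x, v, dt)
      implicit none
      integer, intent(in)    :: np
      real(8), intent(in)    :: dt
      real(8), intent(inout) :: x(np), v(np)
      real(8) :: xmid(np), vmid(np)
      integer :: i

      ! Stage 1: half step using fields evaluated at the current positions.
      do i = 1, np
        xmid(i) = x(i) + 0.5d0*dt*v(i)
        vmid(i) = v(i) + 0.5d0*dt*accel(x(i))
      end do

      ! In the full PIC cycle, charge is deposited and the gyrokinetic
      ! Poisson equation re-solved (PETSc: CG + Jacobi) at this point.

      ! Stage 2: full step using the midpoint values.
      do i = 1, np
        x(i) = x(i) + dt*vmid(i)
        v(i) = v(i) + dt*accel(xmid(i))
      end do

    contains

      real(8) function accel(xp)   ! placeholder field evaluation
        real(8), intent(in) :: xp
        accel = -xp                ! harmonic force as a stand-in
      end function accel

    end subroutine rk2_push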
14. What modes of operation of code are there (eg: linear, nonlinear, reduced models, etc): Fully nonlinear, first-principles gyrokinetic code.
15. Journal references describing code:
16. Codes it is similar to and differences (public version): Has the same physics as the benchmarking version of GTC (gtc_bench) used by NERSC for its procurements, but differs in its parallel algorithm, which is a 2D domain decomposition, toroidal and radial, instead of a 1D toroidal decomposition. This allows GTC-P to scale in both the particle and grid dimensions and greatly reduces the memory footprint per MPI process. GTC-P also has an object-oriented layer on top of the original routines and uses PETSc to solve the Poisson equation. (A sketch of one possible rank-to-domain mapping follows.)
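To make the 2D decomposition concrete, a minimal MPI sketch of one plausible rank-to-(toroidal, radial) domain mapping; the layout and names (nradial, comm_tor, comm_rad) are assumptions for illustration, not GTC-P's actual scheme:

    program decomp2d
      use mpi
      implicit none
      integer, parameter :: nradial = 4        ! assumed radial domain count
      integer :: rank, nproc, ierr
      integer :: itor, irad, comm_tor, comm_rad

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)

      ! Map each rank to a (toroidal, radial) domain pair.
      itor = rank / nradial        ! toroidal domain index
      irad = mod(rank, nradial)    ! radial domain index

      ! Sub-communicators: comm_tor groups the radial domains within one
      ! toroidal section; comm_rad groups one radial annulus across all
      ! toroidal sections.
      call MPI_Comm_split(MPI_COMM_WORLD, itor, irad, comm_tor, ierr)
      call MPI_Comm_split(MPI_COMM_WORLD, irad, itor, comm_rad, ierr)

      call MPI_Finalize(ierr)
    end program decomp2d

Splitting the world communicator this way would let grid reductions and particle shifts each use collective calls over only the ranks they involve, which is one way such a 2D scheme keeps communication local.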
17. Results of code verification and convergence studies (with references):
  • S. Ethier, M. Adams, J. Carter, and L. Oliker, "Petascale Parallelization of the Gyrokinetic Toroidal Code," in Proceedings of VECPAR'10, the 9th International Meeting on High Performance Computing for Computational Science, Berkeley, CA, June 23, 2010 (http://vecpar.fe.up.pt/2010/).
  • M.F. Adams, S. Ethier, and N. Wichmann, "Performance of particle in cell methods on highly concurrent computational architectures", J. of Physics: Conf. Series 78, 012001 (2007).
18. Present and recent applications and validation exercises (with references as available):
19. Limitations of code parameter regime (dimensionless parameters accessible):
  • Only circular cross-section in the large-aspect ratio approximation.
  • Only adiabatic electrons at the moment.
  • Single ion species, core turbulence.
20. What third party software is used? (eg. Meshing software, PETSc, ...): PETSc, ADIOS
21. Description of scalability: Excellent weak scaling of both grid and particle counts beyond 100,000 cores.
(figure ?)
22. Major serial and parallel bottlenecks: The PIC gather and scatter steps are the main impediments to achieving peak performance. Some load imbalance appears for very large systems (larger than ITER). (A sketch of the scatter step's access pattern follows.)
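For illustration, a simplified Fortran sketch of the scatter (charge deposition) step showing why it resists peak performance: the grid index is data-dependent, so the writes are effectively random-access and, under threading, prone to update conflicts. The 1D linear weighting and names here are simplifications, not GTC-P's actual deposition:

    ! Simplified 1D charge scatter: indirect, data-dependent writes defeat
    ! caching and vectorization. Names are illustrative only.
    subroutine scatter_charge(np, ng, xp, dx, rho)
      implicit none
      integer, intent(in)    :: np, ng
      real(8), intent(in)    :: xp(np), dx
      real(8), intent(inout) :: rho(0:ng)
      integer :: i, j
      real(8) :: w

      do i = 1, np
        j = int(xp(i)/dx)          ! cell index: data-dependent, non-sequential
        w = xp(i)/dx - dble(j)     ! linear weight within the cell
        rho(j)   = rho(j)   + (1.0d0 - w)   ! indirect read-modify-write
        rho(j+1) = rho(j+1) + w
      end do
    end subroutine scatter_charge

The gather step has the mirror-image problem: data-dependent indirect reads of the field arrays at each particle's position.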
23. Are there smaller codes contained in the larger code? No.
24. Supported platforms and portability: All major large-scale systems (Cray XT, Blue Gene, etc.)
25. Illustrations of time-to-solution on different platforms and for different complexity of physics, if applicable.