1. Code Name: M3D-K
2. Code Category: Transport, Equilibrium, MHD, GK, Neutrals, RF (etc) 3D Nonlinear Kinetic/MHD hybrid code
3. Primary Developer: Guoyong Fu
4. Other Developers and Users Developer/users: Jianying Lang, Josh Breslau Other users: Feng Wang
5. Short description (one line if possible): M3D-K is a massively parallel kinetic/MHD hybrid code that computes the initial value solutions of the kinetic/MHD equations in toroidal geometry with thermal plasma treated as a single fluid and the energetic particles treated as drift-kinetic or gyrokinetic particles using PIC method. The effects of energetic particles enter via the stress tensor term in the momentum equation. M3D-K is a subset of the multi-level extended MHD 3D code M3D. This document describes mainly the PIC part of M3D. Please refer to the corresponding document of M3D for other details.
6. Computer Language (Fortran77, Fortran90, C, C++, etc) and approx # of lines: Mixture of C (mesh generation, finite element operators, PETSc interface) and Fortran 90 (physics driver) routines. Approximately 55,000 lines of C and 61,000 lines of Fortran code.
7. Type of input required (including files and/or output from other codes). Is there any special input preparation system (e.g., GUI): Requires an input for specifying MHD equilibrium. Also requires a "wxy" file containing input namelists, and optionally a "config.dat" file containing mesh dimensions and domain decomposition. The PIC part requires input for specifying the initial equilibrium distribution of energetic particles.
8. Type of output produced (including files that are read by other codes and sizes of large files and synthetic diagnostics):
For the fluid part, the primary output mechanism is HDF5 file 3d.001.h5, containing 3D field data for up to 13 variables (including vector B field) sampled at a number of time steps. This is typically visualized using AVS/Express or Visit. There are also a number of post-processors for analyzing data. Typical file sizes range from 10s of MB up to 1 GB. M3D also generates unformatted binary checkpoint files at specified intervals for restart purposes, up to a maximum of three for a single run; each is approximately 50% larger than a single time slice of the HDF5 file.
For the PIC part, the primary output is a single text file containing the particle arrays for particle position, magnetic moment, and parallel velocity. This file is used for restart purposes together with the fluid checkpoint file. The size of this file is typically about ten times the size of the checkpoint file for fluid variables. There is also some output file at multiple time slices during one run for particle distribution function on a 3D grid for visualization purpose.
9. Describe any postprocessors which read the output files: For the PIC part, there are a number of postprocessors for analyzing the particle distribution function.
10. Status and location of code input/output documentation: Code documentation is incomplete. The most complete instructions are currently in the various documents at http://w3.pppl.gov/m3d/reference.html
11. Code web site? http://w3.pppl.gov/m3d/index.php
12. Is code under version control? What system? Is automated regression testing performed? No.
13. One to two paragraph description of equations solved and functionality including what discretizations are used in space and time:
For the fluid part, the basic equations solved are those of resistive MHD: continuity, momentum, Ohm's law, Ampere's law, and an equation of state, typically adiabatic. These are used to advance normalized scalar variables representing number density, a poloidal velocity stream function, a compressible poloidal velocity potential, toroidal velocity, poloidal magnetic flux, toroidal magnetic field, and total pressure. Terms may be dropped from these equations to give ideal or reduced MHD.
The M3D mesh uses cylindrical coordinates (R,\phi,z) and a toroidal topology. The 2D spatial discretization in each constant-\phi plane is by linear finite elements with C0 continuity on an unstructured triangular mesh. Discretization in the toroidal direction is performed using either high-order finite differences or a pseudospectral option. Time discretization can be either first or second order. A semi-implicit method is used for both compressional Alfven waves and shear Alfven waves in the time advance.
For the PIC part, moments are taken of the energetic particle distribution function to compute the divergence of the hot ion stress tensor, which appears in the momentum equation. The particle distribution is evolved using drift-kinetic equation or gyrokinetic equation via PIC method. The simulation particles are advanced and the fast ion stress tensor is evaluated using the same cylindrical coordinates and mesh as those for the fluid equations. The particle domain decomposition is also the same as that of the fluid part.
14. What modes of operation of code are there (e.g.: linear, nonlinear, reduced models, etc ): M3D-K can be run in linear, nonlinear, or equilibrium solver mode, with or without toroidal rotation.
15. Journal references describing code: W. Park, et al., Phys. Plasmas 6, 1796 (1999). G.Y. Fu, et al., Phys. Plasmas 13, 052517 (2006).
16. Codes it is similar to and differences (public version): M3D-K is similar to other hybrid codes such as NIMROD in USA, MEGA in Japan and HMGC in Europe. ALL these are global nonlinear kinetic/MHD codes capable of simulating energetic particle-driven modes in tokamak plasmas. NIMROD uses high order finite elements instead of linear finite elements. MEGA uses finite difference on square (R,Z) grid instead of linear finite elements on unstructured mesh. HMGC uses a reduced MHD model for thermal plasmas instead of full MHD equations.
17. Results of code verification and convergence studies (with references): The calculated growth rate and mode frequency of an n=1 internal kink mode and fishbone compare well with the NOVA-K code. [G.Y. Fu, et al., , Phys. Plasmas 13, 052517 (2006).] The calculated linear growth rate, mode frequency and mode structure of a energetic particle-driven n=1 TAE agree well with the NOVA-K results. The calculated scaling of nonlinear saturation level of an n=1 TAE with collision frequency agrees well with analytic theory. [ J.Y. Lang et al, Phys. Plasmas 17, 042309 (2010)]. The calculated Alfven eigenmodes (both RSAE and TAE) agree well with the NOVA results with respect to mode frequency and mode structure. [J.Y. Lang et al., 2011 TTF meeting, San Diego, April 6-9, 2011].
18. Present and recent applications and validation exercises (with references as available):
Nonlinear hybrid simulations of fishbone instability [G.Y. Fu, et al., Phys. Plasmas 13, 052517 (2006).]
Nonlinear hybrid simulations of multiple beam-driven Alfven modes in NSTX [G.Y. Fu, 49th Annual Meeting of the American Physical Society Division of Plasma Physics, November 12Ð16, 2007; Orlando, Florida, invited talk, paper BI2.00004.]
Nonlinear hybrid simulations of TAE modes with source and sink [J. Lang, et al., Phys. Plasmas 17, 042309 (2010)]
Hybrid simulations of beam-driven Alfven eigenmodes in NSTX [ G.Y. Fu et al., the 2011 TTF meeting, April 6-9, 2011, San Diego].
Hybrid simulations of beam-driven Alfven eigenmodes in DIII-D [ J.Y. Lang et al., the 2011 TTF meeting, April 6-9, 2011, San Diego].
19. Limitations of code parameter regime (dimensionless parameters accessible): Due to the time step restriction and efficiency considerations, total simulated time is limited to no more than a few milliseconds.
20. What third party software is used? (e.g. Meshing software, PETSc, ...): PETSc (including HYPRE), FFTW, HDF5, NetCDF 21. Description of scalability: M3D-K has been demonstrated to have good weak scaling up to ~10,000 processors on the Cray XT4 using 3D domain decomposition. At present, typical runs use up to a few hundred cores with millions of simulation particles for moderate toroidal mode numbers. In future, we expect to use more than 1000 cores efficiently for larger production runs.
22. Major serial and parallel bottlenecks: The majority of execution time in M3D-K is spent in the PIC part. There appears to be no bottlenecks up to 10,000 cores.
23. Are there smaller codes contained in the larger code? Describe: No.
24. Supported platforms and portability Runs on the Cray XT4 and XT5 at NERSC. Limited portability to other systems due to reliance on an old and not-widely-supported version of PETSc. (To be remedied in the near future).
25. Illustrations of time-to-solution on different platforms and for different complexity of physics, if applicable: A typical nonlinear run takes about a few days of wall clock time depending on size of the problem.

M3DK1. Code Name: M3D-K

2. Code Category: Transport, Equilibrium, MHD, GK, Neutrals, RF (etc) 3D Nonlinear Kinetic/MHD hybrid code

3. Primary Developer: Guoyong Fu

4. Other Developers and Users Developer/users: Jianying Lang, Josh Breslau Other users: Feng Wang

5. Short description (one line if possible): M3D-K is a massively parallel kinetic/MHD hybrid code that computes the initial value solutions of the kinetic/MHD equations in toroidal geometry with thermal plasma treated as a single fluid and the energetic particles treated as drift-kinetic or gyrokinetic particles using PIC method. The effects of energetic particles enter via the stress tensor term in the momentum equation. M3D-K is a subset of the multi-level extended MHD 3D code M3D. This document describes mainly the PIC part of M3D. Please refer to the corresponding document of M3D for other details.

6. Computer Language (Fortran77, Fortran90, C, C++, etc) and approx # of lines: Mixture of C (mesh generation, finite element operators, PETSc interface) and Fortran 90 (physics driver) routines. Approximately 55,000 lines of C and 61,000 lines of Fortran code.

7. Type of input required (including files and/or output from other codes). Is there any special input preparation system (e.g., GUI): Requires an input for specifying MHD equilibrium. Also requires a "wxy" file containing input namelists, and optionally a "config.dat" file containing mesh dimensions and domain decomposition. The PIC part requires input for specifying the initial equilibrium distribution of energetic particles.

8. Type of output produced (including files that are read by other codes and sizes of large files and synthetic diagnostics):

For the fluid part, the primary output mechanism is HDF5 file 3d.001.h5, containing 3D field data for up to 13 variables (including vector B field) sampled at a number of time steps. This is typically visualized using AVS/Express or Visit. There are also a number of post-processors for analyzing data. Typical file sizes range from 10s of MB up to 1 GB. M3D also generates unformatted binary checkpoint files at specified intervals for restart purposes, up to a maximum of three for a single run; each is approximately 50% larger than a single time slice of the HDF5 file.

For the PIC part, the primary output is a single text file containing the particle arrays for particle position, magnetic moment, and parallel velocity. This file is used for restart purposes together with the fluid checkpoint file. The size of this file is typically about ten times the size of the checkpoint file for fluid variables. There is also some output file at multiple time slices during one run for particle distribution function on a 3D grid for visualization purpose.

9. Describe any postprocessors which read the output files: For the PIC part, there are a number of postprocessors for analyzing the particle distribution function.

10. Status and location of code input/output documentation: Code documentation is incomplete. The most complete instructions are currently in the various documents at http://w3.pppl.gov/m3d/reference.html

11. Code web site? http://w3.pppl.gov/m3d/index.php

12. Is code under version control? What system? Is automated regression testing performed? No.

13. One to two paragraph description of equations solved and functionality including what discretizations are used in space and time:

For the fluid part, the basic equations solved are those of resistive MHD: continuity, momentum, Ohm's law, Ampere's law, and an equation of state, typically adiabatic. These are used to advance normalized scalar variables representing number density, a poloidal velocity stream function, a compressible poloidal velocity potential, toroidal velocity, poloidal magnetic flux, toroidal magnetic field, and total pressure. Terms may be dropped from these equations to give ideal or reduced MHD.

The M3D mesh uses cylindrical coordinates (R,\phi,z) and a toroidal topology. The 2D spatial discretization in each constant-\phi plane is by linear finite elements with C0 continuity on an unstructured triangular mesh. Discretization in the toroidal direction is performed using either high-order finite differences or a pseudospectral option. Time discretization can be either first or second order. A semi-implicit method is used for both compressional Alfven waves and shear Alfven waves in the time advance.

For the PIC part, moments are taken of the energetic particle distribution function to compute the divergence of the hot ion stress tensor, which appears in the momentum equation. The particle distribution is evolved using drift-kinetic equation or gyrokinetic equation via PIC method. The simulation particles are advanced and the fast ion stress tensor is evaluated using the same cylindrical coordinates and mesh as those for the fluid equations. The particle domain decomposition is also the same as that of the fluid part.

14. What modes of operation of code are there (e.g.: linear, nonlinear, reduced models, etc ): M3D-K can be run in linear, nonlinear, or equilibrium solver mode, with or without toroidal rotation.

15. Journal references describing code: W. Park, et al., Phys. Plasmas

6, 1796 (1999). G.Y. Fu, et al., Phys. Plasmas13, 052517 (2006).16. Codes it is similar to and differences (public version): M3D-K is similar to other hybrid codes such as NIMROD in USA, MEGA in Japan and HMGC in Europe. ALL these are global nonlinear kinetic/MHD codes capable of simulating energetic particle-driven modes in tokamak plasmas. NIMROD uses high order finite elements instead of linear finite elements. MEGA uses finite difference on square (R,Z) grid instead of linear finite elements on unstructured mesh. HMGC uses a reduced MHD model for thermal plasmas instead of full MHD equations.

17. Results of code verification and convergence studies (with references): The calculated growth rate and mode frequency of an n=1 internal kink mode and fishbone compare well with the NOVA-K code. [G.Y. Fu, et al., , Phys. Plasmas

13, 052517 (2006).] The calculated linear growth rate, mode frequency and mode structure of a energetic particle-driven n=1 TAE agree well with the NOVA-K results. The calculated scaling of nonlinear saturation level of an n=1 TAE with collision frequency agrees well with analytic theory. [ J.Y. Lang et al, Phys. Plasmas17, 042309 (2010)]. The calculated Alfven eigenmodes (both RSAE and TAE) agree well with the NOVA results with respect to mode frequency and mode structure. [J.Y. Lang et al., 2011 TTF meeting, San Diego, April 6-9, 2011].18. Present and recent applications and validation exercises (with references as available):

Nonlinear hybrid simulations of fishbone instability [G.Y. Fu, et al., Phys. Plasmas

13, 052517 (2006).]Nonlinear hybrid simulations of multiple beam-driven Alfven modes in NSTX [G.Y. Fu, 49th Annual Meeting of the American Physical Society Division of Plasma Physics, November 12Ð16, 2007; Orlando, Florida, invited talk, paper BI2.00004.]

Nonlinear hybrid simulations of TAE modes with source and sink [J. Lang, et al., Phys. Plasmas

17, 042309 (2010)]Hybrid simulations of beam-driven Alfven eigenmodes in NSTX [ G.Y. Fu et al., the 2011 TTF meeting, April 6-9, 2011, San Diego].

Hybrid simulations of beam-driven Alfven eigenmodes in DIII-D [ J.Y. Lang et al., the 2011 TTF meeting, April 6-9, 2011, San Diego].

19. Limitations of code parameter regime (dimensionless parameters accessible): Due to the time step restriction and efficiency considerations, total simulated time is limited to no more than a few milliseconds.

20. What third party software is used? (e.g. Meshing software, PETSc, ...): PETSc (including HYPRE), FFTW, HDF5, NetCDF 21. Description of scalability: M3D-K has been demonstrated to have good weak scaling up to ~10,000 processors on the Cray XT4 using 3D domain decomposition. At present, typical runs use up to a few hundred cores with millions of simulation particles for moderate toroidal mode numbers. In future, we expect to use more than 1000 cores efficiently for larger production runs.

22. Major serial and parallel bottlenecks: The majority of execution time in M3D-K is spent in the PIC part. There appears to be no bottlenecks up to 10,000 cores.

23. Are there smaller codes contained in the larger code? Describe: No.

24. Supported platforms and portability Runs on the Cray XT4 and XT5 at NERSC. Limited portability to other systems due to reliance on an old and not-widely-supported version of PETSc. (To be remedied in the near future).

25. Illustrations of time-to-solution on different platforms and for different complexity of physics, if applicable: A typical nonlinear run takes about a few days of wall clock time depending on size of the problem.