This page provides access to a GPU-accelerated version of the widely used Object Oriented Micromagnetic Framework (OOMMF). The GPU acceleration provides up to a 70x speed-up over the CPU on the latest GPUs. GPU OOMMF was developed in collaboration with Dr. Michael J. Donahue, National Institute of Standards and Technology (NIST).

The GPU implementation leaves most of the user-facing OOMMF components unchanged and modifies only the lower-level modules, so OOMMF users can run their existing models as before, but at much higher speed. The current implementation supports two running modes: if no GPU modules are loaded, the code runs purely on the CPU, exactly like standard CPU OOMMF; otherwise, the heavy computational work is accelerated by the GPU while the lightweight front-end work runs on the CPU.
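As a minimal illustration, here are the standard CPU demag block and its GPU counterpart (Oxs_Demag is the standard OOMMF class; GPU_Demag is the class name used in the complete example further down this page):

# standard OOMMF demag term, computed on the CPU
Specify Oxs_Demag {}

# demag term from the GPU OOMMF package, computed on the GPU
Specify GPU_Demag {}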

The implementation of GPU-accelerated OOMMF is open-source and is available for download. Because of changes in newer CUDA libraries, some routines used in GPU OOMMF were deprecated and have been modified in a newly released version, so downloads for the different versions of GPU OOMMF are provided. The original version of GPU OOMMF, which supports CUDA 7.5 and below, is available for download here. A modified version of GPU OOMMF, which supports all available CUDA versions, is available for download here. Please cite the following article if you use GPU-accelerated OOMMF in your research.

S. Fu, W. Cui, M. Hu, R. Chang, M.J. Donahue, V. Lomakin, "Finite Difference Micromagnetic Solvers with Object Oriented Micromagnetic framework (OOMMF) on Graphics Processing Units," IEEE Transactions on Magnetics, vol. 52, no. 4, pp. 1-9, 2016.

Download

Installation

Windows

1. Install Nvidia CUDA.

2. Download the GPU OOMMF package and unpack it.

3. In Control Panel -> System and Security -> System -> Advanced System Settings -> Advanced -> Environment Variables, add a new variable CUDA_HOME with the value %CUDA_PATH%. (A command-line equivalent is sketched after step 4 below.)

4. Type the following commands in the Visual Studio x64 Cross Tools Command Prompt:

cd %PATH_TO_GPU_OOMMF_DIR%

tclsh oommf.tcl pimake clean

tclsh oommf.tcl pimake
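For reference, steps 3 and 4 can also be carried out entirely from the Visual Studio x64 Cross Tools Command Prompt. The following is only a sketch: the directory path is a placeholder, setx persists CUDA_HOME for future sessions (equivalent to the Environment Variables dialog in step 3), and the optional +platform call merely prints the build configuration OOMMF has detected.

REM run from the Visual Studio x64 Cross Tools Command Prompt
setx CUDA_HOME "%CUDA_PATH%"
REM setx only affects new sessions, so also set the variable for the current one
set CUDA_HOME=%CUDA_PATH%
cd /d %PATH_TO_GPU_OOMMF_DIR%
REM optional: check the platform configuration detected by OOMMF
tclsh oommf.tcl +platform
tclsh oommf.tcl pimake clean
tclsh oommf.tcl pimake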

Linux

1. Install Nvidia CUDA. Note that in newer releases of the CUDA library (CUDA 8.0 and later), the 'include' and 'lib64' entries under <PATH_TO_CUDA_DIR> may be symbolic links to folders rather than ordinary folders, which breaks the compilation of GPU OOMMF. In that case, apply the following workaround; it does not affect the functioning of the CUDA library (a shell sketch is given after the installation steps below):

Copy the folders that the links point to, together with their contents, into <PATH_TO_CUDA_DIR>.

Rename the copied folders to match the names of the corresponding links, if they differ.

Delete the links under <PATH_TO_CUDA_DIR>.

2. Download the GPU OOMMF package and unpack it.

3. Type the following commands in a terminal. (The Tcl/Tk paths depend on the versions installed on your system.)

export OOMMF_TCL_INCLUDE_DIR=/usr/include/tcl8.6

export OOMMF_TK_INCLUDE_DIR=/usr/include/tcl8.6

export OOMMF_TK_CONFIG=/usr/lib/x86_64-linux-gnu/tk8.6/tkConfig.sh

export OOMMF_TCL_CONFIG=/usr/lib/x86_64-linux-gnu/tcl8.6/tclConfig.sh

export CUDA_HOME=<PATH_TO_CUDA_DIR>

cd <PATH_TO_GPU_OOMMF_DIR>

tclsh oommf.tcl pimake clean

tclsh oommf.tcl pimake
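The symbolic-link workaround described in step 1 can be scripted along the following lines. This is only a sketch: <PATH_TO_CUDA_DIR> is a placeholder, it assumes 'include' and 'lib64' are the only affected entries, and it may require root privileges depending on where CUDA is installed.

cd <PATH_TO_CUDA_DIR>
# replace each symbolic link with a real copy of the folder it points to
for d in include lib64; do
  cp -rL "$d" "$d.copy"   # -L follows the link, so the target folder itself is copied
  rm "$d"                 # remove the symbolic link
  mv "$d.copy" "$d"       # give the copy the link's original name
done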

Examples

The GPU OOMMF modules are activated in the same way as the standard CPU OOMMF modules; the only necessary changes are the class names in the MIF input files. For example, replacing the energy and time-evolution Specify blocks in the muMAG standard problem 3 input MIF file with the following code activates the GPU modules.

A complete MIF file for GPU OOMMF can be downloaded here. More examples can be found in the GPU OOMMF package.


# Uniaxial anisotropy computed by GPU
Specify GPU_UniaxialAnisotropy_New [subst {
  K1 $K1
  axis {0 0 1}
}]

# Exchange computed by GPU
Specify Oxs_GPU_UniformExchange_New [subst {
  A $A
}]

# Demag computed by GPU
Specify GPU_Demag {}

# Zeeman field computed by GPU
# (not needed for standard problem 3, but included here
# to show how to activate the Zeeman field on GPU)
Specify GPU_FixedZeeman [subst {
  multiplier [expr {1/(10000*$mu0)}]
  field {0 0 0}
}]

# Time evolver on GPU
Specify GPU_EulerEvolve {
  alpha 0.6
}

# Time driver on GPU
Specify Oxs_GPU_TimeDriver [subst {
  basename prob3
  vector_field_output_format {text %.7g}
  scalar_output_format %.15g
  evolver GPU_EulerEvolve
  mesh :mesh
  stopping_time 0.5e-9
  stopping_dm_dt 0.01
  Ms {$Ms}
  m0 [list $m0]
}]
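Once the MIF file references the GPU class names, the simulation is launched exactly as with standard OOMMF, for example through the Oxsii GUI or the boxsi batch-mode solver. The file name below is a placeholder for the modified standard problem 3 file:

tclsh oommf.tcl boxsi prob3_gpu.mif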

Available GPU Modules

Modules In Preparation

Benchmarks

Test problem: cubic magnetic particle

Test environment: Windows (64-bit), Nvidia GTX 690 (one GPU device with 2 GB memory used; the speed with the latest Titan X is more than 2x higher), Intel Xeon E5-1650 @ 3.2 GHz

Metric: wall-clock time per field evaluation, in ms. The numbers in parentheses are speed-ups relative to the single-threaded CPU time; for example, in the 256K-cell row of the first table, 140.7 ms / 5.64 ms ≈ 25x.

Problem Scale (#Cells) | GPU (single prec.) | GPU (double prec.) | CPU 1-thread (double prec.) | CPU 6-thread (double prec.)
4K (16³)         | 0.91 (1.8x)   | 1.18 (1.4x)   | 1.65   | 0.71 (2.3x)
32K (32³)        | 1.59 (8.8x)   | 2.99 (4.7x)   | 14.00  | 5.49 (2.5x)
256K (64³)       | 5.64 (25.0x)  | 17.58 (8.0x)  | 140.7  | 51.56 (2.7x)
2M (128³)        | 42.75 (29.0x) | 148.1 (8.4x)  | 1242.9 | 470.1 (2.6x)
4M (128×128×256) | 102.2 (34.8x) | N/A           | 3556.4 | 993.4 (3.6x)

Problem Scale (#Cells) | GPU (single prec.) | GPU (double prec.) | CPU 1-thread (double prec.) | CPU 6-thread (double prec.)
4K (16³)         | 0.84 (2.7x)   | 0.84 (2.7x)   | 2.24   | 0.51 (4.4x)
32K (32³)        | 0.94 (19.4x)  | 2.36 (7.8x)   | 18.29  | 3.90 (4.7x)
256K (64³)       | 5.23 (34.3x)  | 17.35 (10.4x) | 179.54 | 39.29 (4.6x)
2M (128³)        | 46.63 (31.0x) | 153.20 (9.5x) | 1450.9 | 324.9 (4.5x)
4M (128×128×256) | 94.58 (31.0x) | N/A           | 3040.9 | 680.3 (4.5x)