Welcome to SciGPU.org!
This is a website for an emerging community whose shared goal is harnessing the power of general-purpose programming of graphics processing units to accelerate data-intensive science. The Harvard-based SciGPU community shares knowledge through the site and informal seminars, as well as formal collaborations and publications.
Downloads
By mark
On February 1, 2010, Won-Ki Jeong, Research Scientist at Harvard IIC/SEAS, gave a SciGPU lunchtime seminar entitled:
“GPU-accelerated Biomedical Image Analysis”
Download his presentation here [20MB].
Abstract
————-
Determining the detailed connections in brain circuits is a fundamental unsolved problem in neuroscience. Understanding this circuitry will enable brain scientists to confirm or refute existing models, develop new ones, and come closer to an understanding of how the brain works. High-resolution, large-scale medical images play a central role in brain analysis and also pose challenging computational problems for 3D segmentation and visualization in terms of developing suitable algorithms, coping with ever-increasing data sizes, and maintaining interactive performance.
In this talk, I will introduce my past and recent research results in GPU-accelerated biomedical image analysis. Specifically, I will talk about the Fast Iterative Method, a parallel algorithm to solve a class of Hamilton-Jacobi equations for weighted distance computation and its application in DT-MRI white matter connectivity analysis. Second, I will introduce NeuroTrace, a GPU-accelerated semi-automatic segmentation and interactive visualization system for processing terabytes of electron microscopy image data, a first step toward the complete reconstruction of neural connections in the mammalian brain.
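To give a feel for the kind of computation the Fast Iterative Method performs, here is a minimal sequential sketch in Python. It assumes the simplest case, an isotropic eikonal equation |∇U| = slowness on a 2D grid with unit spacing, and uses a standard first-order upwind update; the DT-MRI application mentioned above involves a more general anisotropic equation. The function names and the tolerance `eps` are illustrative choices, not taken from the talk.

```python
import math

INF = math.inf


def neighbors(i, j, rows, cols):
    """Yield the 4-connected neighbours of (i, j) that lie inside the grid."""
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            yield ni, nj


def local_solve(U, slowness, i, j):
    """First-order upwind update for |grad U| = slowness at node (i, j)."""
    rows, cols = len(U), len(U[0])
    up = U[i - 1][j] if i > 0 else INF
    down = U[i + 1][j] if i < rows - 1 else INF
    left = U[i][j - 1] if j > 0 else INF
    right = U[i][j + 1] if j < cols - 1 else INF
    a, b = sorted((min(up, down), min(left, right)))  # a <= b
    if a == INF:
        return U[i][j]  # no neighbour has been reached yet
    f = slowness[i][j]
    if b - a >= f:
        candidate = a + f  # only one axis contributes
    else:
        candidate = 0.5 * (a + b + math.sqrt(2.0 * f * f - (a - b) ** 2))
    return min(U[i][j], candidate)  # values only ever decrease


def fast_iterative_method(slowness, sources, eps=1e-12):
    """Arrival times from the source nodes, computed with an active list."""
    rows, cols = len(slowness), len(slowness[0])
    U = [[INF] * cols for _ in range(rows)]
    for i, j in sources:
        U[i][j] = 0.0
    active = {nb for s in sources
              for nb in neighbors(s[0], s[1], rows, cols) if nb not in sources}
    while active:
        # On a GPU the nodes in the active list would be updated in parallel;
        # this sketch simply visits them one after another.
        for i, j in list(active):
            old = U[i][j]
            new = local_solve(U, slowness, i, j)
            U[i][j] = new
            if abs(old - new) < eps:  # node has converged
                for ni, nj in neighbors(i, j, rows, cols):
                    if (ni, nj) in active:
                        continue
                    q = local_solve(U, slowness, ni, nj)
                    if q < U[ni][nj]:  # neighbour can still improve
                        U[ni][nj] = q
                        active.add((ni, nj))
                active.discard((i, j))
    return U


if __name__ == "__main__":
    grid = [[1.0] * 5 for _ in range(5)]  # uniform medium
    for row in fast_iterative_method(grid, [(0, 0)]):
        print(" ".join(f"{v:5.2f}" for v in row))
```

Unlike methods that keep a sorted queue of grid nodes, the active list needs no ordering, which is what makes this style of update attractive for parallel hardware.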
About the speaker
—————————-
Won-Ki Jeong is a research scientist at the Initiative in Innovative Computing (IIC) in the School of Engineering and Applied Sciences (SEAS) at Harvard University. His research interests include image processing, scientific visualization, and general-purpose computing on graphics processors in the field of biomedical image analysis. He received a Ph.D. in Computer Science in 2008 from the University of Utah, where he was a member of the Scientific Computing and Imaging (SCI) Institute. He received an NVIDIA Fellowship in 2007. He is currently a professional member of ACM.
Downloads
By mikec
Ron Babich from Boston University gave a talk entitled “Unraveling the Mysteries of Quarks with GPUs” for the IIC SciGPU seminar on February 22nd, 2010. Slides are available here.
Downloads
By admin
Note: The application process is now closed. Thanks for your interest!
SciGPU is pleased to announce summer research opportunities in scientific GPU computing for undergraduates. We seek undergraduates majoring in science and engineering who are interested in developing new algorithms and systems that use GPUs for applications in astronomy, quantum chemistry and neuroscience.
Interested students may apply at www.reusite.seas.harvard.edu/application. These hands-on experiences are best suited to students with programming experience, but it is not necessary to have experience with GPUs. Participants will become part of a large, diverse research community through organized and informal interactions with students, mentors and faculty in the summer intern programs of the Harvard School of Engineering and Applied Sciences.
Students participating in this year’s program, sponsored by the National Science Foundation, will spend June 6 through Aug. 14 on the Harvard campus. They will receive a stipend of $3,900 and a $300 travel allowance as well as on-campus housing at no additional charge.
Information about last year’s program may be found here. Download the attached flyer for additional details.
Downloads
By admin
Mark Silberstein (Technion) gave a SciGPU talk at Harvard entitled “Efficient sum-product computations on GPUs through software-managed cache” on November 23, 2009. His slides are posted here: SumProductHarvard
Downloads
By admin
Dr. Peter Lu (Harvard University, Physics) recently gave a presentation to the SciGPU group based on his work outlined in the journal paper below:
———–
We implement image correlation, a fundamental component of many real-time imaging and tracking systems, on a graphics processing unit (GPU) using NVIDIA's CUDA. We use our code to analyze images of liquid-gas phase separation in a model colloid-polymer system, photographed in the absence of gravity aboard the International Space Station (ISS). Our GPU code is 4000 times faster than simple MATLAB code performing the same calculation on a central processing unit (CPU), 130 times faster than simple C code, and 30 times faster than optimized C++ code using single-instruction, multiple data (SIMD) extensions. The speed increases from these parallel algorithms enable us to analyze images downlinked from the ISS in a rapid fashion and send feedback to astronauts on orbit while the experiments are still being run.
Download PeterLu_BCAT_JRealTimeImageProc_2009.
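For readers unfamiliar with image correlation, the following is a small CPU sketch in Python of one common form of it, a direct circular cross-correlation between two images. It is not the code from the paper: the exact correlation definition used there is not given here, and the function names (`cross_correlate`, `best_shift`, `circular_shift`) are made up for illustration. The point is that every output element is an independent sum, so on a GPU each (dy, dx) pair could be handled by its own thread.

```python
def cross_correlate(a, b):
    """Circular cross-correlation:
    C[dy][dx] = sum over (y, x) of a[y][x] * b[(y + dy) % H][(x + dx) % W]."""
    height, width = len(a), len(a[0])
    out = [[0.0] * width for _ in range(height)]
    for dy in range(height):
        for dx in range(width):
            # Each (dy, dx) is independent of all others: a natural unit of
            # parallel work.
            total = 0.0
            for y in range(height):
                for x in range(width):
                    total += a[y][x] * b[(y + dy) % height][(x + dx) % width]
            out[dy][dx] = total
    return out


def best_shift(a, b):
    """Return the (dy, dx) at which the correlation of a and b peaks."""
    corr = cross_correlate(a, b)
    return max(((dy, dx) for dy in range(len(corr)) for dx in range(len(corr[0]))),
               key=lambda s: corr[s[0]][s[1]])


def circular_shift(img, sy, sx):
    """Shift an image by (sy, sx) pixels with wrap-around."""
    height, width = len(img), len(img[0])
    return [[img[(y - sy) % height][(x - sx) % width] for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    frame = [[0, 0, 0, 0],
             [0, 5, 1, 0],
             [0, 2, 0, 0],
             [0, 0, 0, 0]]
    moved = circular_shift(frame, 2, 1)
    print(best_shift(frame, moved))
```

The direct sum costs on the order of (H·W)² operations per image pair, which is why this kind of calculation benefits so much from massively parallel hardware.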
Quantum chemistry
By admin
Our new article, “Accelerating Correlated Quantum Chemistry Calculations Using Graphical Processing Units and a Mixed Precision Matrix Multiplication Library,” by Roberto Olivares-Amaya, Mark A. Watson, Richard G. Edgar, Leslie Vogt, Yihan Shao and Alan Aspuru-Guzik, is now available online at the JCTC website:
http://pubs.acs.org/doi/abs/10.1021/ct900543q
Abstract
Two new tools for the acceleration of computational chemistry codes using graphical processing units (GPUs) are presented. First, we propose a general black-box approach for the efficient GPU acceleration of matrix−matrix multiplications where the matrix size is too large for the whole computation to be held in the GPU’s onboard memory. Second, we show how to improve the accuracy of matrix multiplications when using only single-precision GPU devices by proposing a heterogeneous computing model, whereby single- and double-precision operations are evaluated in a mixed fashion on the GPU and central processing unit, respectively. The utility of the library is illustrated for quantum chemistry with application to the acceleration of resolution-of-the-identity second-order Møller−Plesset perturbation theory calculations for molecules, which we were previously unable to treat. In particular, for the 168-atom valinomycin molecule in a cc-pVDZ basis set, we observed speedups of 13.8, 7.8, and 10.1 times for single-, double- and mixed-precision general matrix multiply (SGEMM, DGEMM, and MGEMM), respectively. The corresponding errors in the correlation energy were reduced from −10.0 to −1.2 kcal mol⁻¹ for SGEMM and MGEMM, respectively, while higher accuracy can be easily achieved with a different choice of cutoff parameter.
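The first idea in the abstract, multiplying matrices that are too large for the GPU's memory, can be pictured as a blocked product. The Python sketch below is only an illustration under simple assumptions: `device_gemm` is a hypothetical stand-in for a GPU GEMM call that can only handle small blocks, and the square tile size `tile` is an arbitrary choice. How the published library actually picks and schedules its pieces is not described here.

```python
def device_gemm(a, b):
    """Stand-in for a GPU GEMM call on blocks small enough for device memory."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def block(mat, r0, r1, c0, c1):
    """Copy the sub-matrix mat[r0:r1, c0:c1]."""
    return [row[c0:c1] for row in mat[r0:r1]]


def gemm_cleaved(A, B, tile):
    """Compute C = A * B using only products of blocks no larger than tile x tile."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # block row of C
        for j0 in range(0, m, tile):      # block column of C
            for p0 in range(0, k, tile):  # blocks along the shared dimension
                a_blk = block(A, i0, i0 + tile, p0, p0 + tile)
                b_blk = block(B, p0, p0 + tile, j0, j0 + tile)
                c_blk = device_gemm(a_blk, b_blk)
                # Accumulate the partial product into the matching part of C.
                for di, row in enumerate(c_blk):
                    for dj, value in enumerate(row):
                        C[i0 + di][j0 + dj] += value
    return C


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[1, 0], [0, 1], [2, 2]]
    print(gemm_cleaved(A, B, tile=2))
```

Because the caller sees a single routine with the usual inputs and outputs, the splitting stays hidden, which is the sense in which such an approach is a black box.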
Astronomy
By admin
The Murchison Widefield Array is using a real-time GPU correlator to enable engineering and early science for a 5% prototype. Read more about how this system works! See online coverage of the MWA showcasing GPU computing efficiency, as described at the NVIDIA GPU Technology Conference, San Jose 2009. Take a look at the related talk, Diesel-Power GPU Computing.
Downloads
By admin
This is a poster that was recently presented at the NVIDIA GPU Technology Conference (GTC).
Abstract
———
Using the CUDA platform, we have implemented a mixed-precision Krylov solver for the Wilson-Dirac matrix for lattice QCD. The matrix-vector product, which accounts for the vast majority of the operations, runs in excess of 130 Gflops in single precision on the GTX 280. We have developed a new approach for mixed-precision Krylov solvers that achieves in excess of 100 Gflops and achieves full double-precision accuracy. We also explore the use of half precision in this context to further decrease time to solution. Finally, we report on initial findings for extending the problem to multiple GPUs, where we find reasonable performance scaling.
Download: Lattice QCD poster
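As background on how a solver can do most of its work in low precision and still return a double-precision answer, here is a small Python sketch of classic defect correction (iterative refinement). This is a textbook technique and not necessarily the new approach described in the poster. It also assumes a small symmetric positive-definite matrix solved with conjugate gradient, whereas the Wilson-Dirac matrix is a different kind of operator. Single precision is emulated by rounding intermediate results with `struct`; all function names are illustrative.

```python
import math
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot(u, v, rnd=float):
    # Products and the final sum are rounded with rnd; the running sum itself
    # is kept in double, so this is only a coarse emulation of float32.
    return rnd(sum(rnd(a * b) for a, b in zip(u, v)))


def matvec(A, x, rnd=float):
    return [dot(row, x, rnd) for row in A]


def cg_low_precision(A, b, rel_tol=1e-4, max_iter=50):
    """Conjugate gradient with intermediate results rounded to float32."""
    A32 = [[f32(a) for a in row] for row in A]
    x = [0.0] * len(b)
    r = [f32(v) for v in b]
    p = list(r)
    rr = dot(r, r, f32)
    rr0 = rr
    for _ in range(max_iter):
        if math.sqrt(rr) <= rel_tol * math.sqrt(rr0):
            break
        Ap = matvec(A32, p, f32)
        alpha = f32(rr / dot(p, Ap, f32))
        x = [f32(xi + alpha * pi) for xi, pi in zip(x, p)]
        r = [f32(ri - alpha * api) for ri, api in zip(r, Ap)]
        rr_new = dot(r, r, f32)
        p = [f32(ri + (rr_new / rr) * pi) for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def solve_mixed(A, b, tol=1e-12, max_outer=20):
    """Double-precision residuals outside, low-precision solves inside."""
    x = [0.0] * len(b)
    b_norm = math.sqrt(dot(b, b))
    for _ in range(max_outer):
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]  # double precision
        r_norm = math.sqrt(dot(r, r))
        if r_norm <= tol * b_norm:
            break
        # Solve A e = r / |r| cheaply, then scale the correction back up.
        e = cg_low_precision(A, [ri / r_norm for ri in r])
        x = [xi + r_norm * ei for xi, ei in zip(x, e)]
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    print(solve_mixed(A, [6.0, 10.0, 8.0]))
```

Only the residual and the solution update are computed in double precision, so the expensive inner iterations can run at single-precision speed.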
Downloads
By admin
scigpugemm0.8 – a tarball of the v0.8 release of the SciGPU-GEMM library.
Matrix-matrix multiplications are common in quantum chemistry calculations, and can benefit enormously from GPU acceleration. Although NVIDIA provides an implementation of the BLAS *GEMM routines with its CUDA distribution, two key problems exist when trying to use these from existing code:
- Most GPUs in current use have limited memory available
- Few GPUs have double precision hardware available
Although these problems will not usually be encountered when using research clusters, code running on distributed clients (such as BOINC) cannot assume that a large-memory double precision GPU will be available. The SciGPU-GEMM library was written to alleviate these difficulties. The library contains three principal routines:
- dgemm_cleaver: a DGEMM implementation which will split the input matrices into pieces small enough to fit onto the GPU
- sgemm_cleaver: the same, but for SGEMM
- mgemm: a multi-precision matrix-matrix multiplication routine
The last routine, mgemm, splits matrices into ‘small’ and ‘large’ portions. The ‘small’ portions are handled in single precision on the GPU, while the CPU handles the ‘large’ portions in double precision.
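The small/large split can be sketched in a few lines of Python. This is an illustration, not the library's code: it assumes elements are classified by comparing their magnitude with a cutoff, it emulates single precision by rounding with `struct`, and it uses the identity A·B = A·B_large + A_large·B_small + A_small·B_small, where only the last term is computed in single precision. The names `mgemm_sketch`, `split` and `to_f32` are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_f32(M):
    return [[f32(v) for v in row] for row in M]


def split(M, cutoff):
    """Separate M into 'large' (|v| > cutoff) and 'small' parts; M = large + small."""
    large = [[v if abs(v) > cutoff else 0.0 for v in row] for row in M]
    small = [[v if abs(v) <= cutoff else 0.0 for v in row] for row in M]
    return large, small


def gemm(A, B, rnd=float):
    """Plain matrix product; rnd rounds each product and each final sum."""
    cols = list(zip(*B))
    return [[rnd(sum(rnd(a * b) for a, b in zip(row, col))) for col in cols]
            for row in A]


def mgemm_sketch(A, B, cutoff):
    A_large, A_small = split(A, cutoff)
    B_large, B_small = split(B, cutoff)
    # 'GPU' part: small times small, in (emulated) single precision.
    gpu = gemm(to_f32(A_small), to_f32(B_small), rnd=f32)
    # 'CPU' part: every term that involves a large element, in double precision.
    cpu_1 = gemm(A, B_large)
    cpu_2 = gemm(A_large, B_small)
    return [[g + c1 + c2 for g, c1, c2 in zip(gr, r1, r2)]
            for gr, r1, r2 in zip(gpu, cpu_1, cpu_2)]


if __name__ == "__main__":
    A = [[1000.1, 0.001], [0.002, 2000.2]]
    B = [[3000.3, 0.003], [0.004, 4000.4]]
    print(mgemm_sketch(A, B, cutoff=1.0))
    print(gemm(to_f32(A), to_f32(B), rnd=f32))  # everything in single precision
```

Lowering the cutoff moves more elements into the double-precision part, trading speed for accuracy.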
Quantum chemistry
By admin
[Figure: Taxol speedup]
In our recently submitted paper (R. Olivares-Amaya et al., JCTC), the Alan Aspuru-Guzik group presents a new implementation of the quantum chemistry method RI-MP2 (resolution-of-the-identity second-order Møller-Plesset perturbation theory) accelerated using GPUs and the MGEMM library published on this website. For the 168-atom valinomycin molecule in a cc-pVDZ basis set, we observed speedups relative to CPU double-precision results of 13.8x, 7.8x and 10.1x using single-, double- and our new mixed-precision general matrix multiply (SGEMM, DGEMM and MGEMM), respectively. The corresponding errors in the correlation energy were reduced from -10.0 kcal/mol to -1.2 kcal/mol for SGEMM and MGEMM, respectively, thus achieving our goal of ‘chemical accuracy’. In the figure above, speedups for the anti-cancer drug Taxol are shown, up to a factor of 10 relative to using DGEMM on the CPU, as well as the error control for different choices of MGEMM cut-off parameter. In comparison, the SGEMM error for Taxol is approximately 6.6 kcal/mol.