GPU / GPGPU / TPU programming links / Supercomputing at your fingertips
(What is GPU Computing?)
OpenCL is an open general-purpose GPU computing
language. It is an open standard defined by the Khronos Group.
OpenCL provides a cross-platform GPGPU platform that additionally
supports data parallel compute on CPUs. OpenCL is supported on
Intel, AMD, Nvidia, and ARM platforms. The Khronos Group is
currently developing SYCL, implementations of which include
ComputeCpp and SYCL STL.
Nvidia CUDA is a proprietary framework. Nvidia launched CUDA in 2006 as a software development kit (SDK) and application programming interface (API) that allows algorithms to be written in the C programming language for execution on GeForce 8 series and later GPUs.
Close to Metal (CTM), later called Stream, is AMD's GPGPU technology for ATI Radeon-based GPUs. The AMD Stream SDK was released under the AMD EULA in December 2007 after the software stack was rewritten. The Stream SDK provides both high-level and low-level tools for general-purpose access to AMD graphics hardware. Using GPUs for computation holds considerable potential for some applications because GPU microarchitectures differ fundamentally from those of CPUs. GPUs achieve much greater throughput (calculations per second) by executing many programs in parallel and by restricting flow control (the ability of one program to execute instructions independently of another). Modern GPUs also have addressable on-die memory and extremely high-performance multi-channel external memory. AMD subsequently switched from CTM to OpenCL.
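The trade-off between parallel throughput and restricted flow control can be illustrated with a toy model. The sketch below is plain Python, not GPU code, and the function name `run_lockstep` is hypothetical. It assumes the lockstep style of execution usually described for GPUs: one instruction is issued to a whole group of lanes, and a per-lane mask decides which lanes keep the result. When lanes disagree on a branch, both sides are issued, which is the cost of restricted flow control.

```python
# Toy model of lockstep execution: every lane runs the same instruction,
# and a per-lane mask decides whether the result is kept.
# All names are hypothetical and for illustration only.

def run_lockstep(data):
    """Compute 'x * 2 if x is even else x + 1' for every lane in lockstep.

    Returns the results and the number of instructions issued to the group.
    """
    lanes = list(data)
    steps = 0

    # Instruction 1: every lane evaluates the branch condition.
    mask = [x % 2 == 0 for x in lanes]
    steps += 1

    # Instruction 2: the "then" side is issued once if any lane needs it;
    # lanes whose mask is False ignore the result.
    if any(mask):
        lanes = [x * 2 if m else x for x, m in zip(lanes, mask)]
        steps += 1

    # Instruction 3: the "else" side is issued once if any lane needs it;
    # lanes whose mask is True ignore the result.
    if not all(mask):
        lanes = [x if m else x + 1 for x, m in zip(lanes, mask)]
        steps += 1

    return lanes, steps


uniform, uniform_steps = run_lockstep([2, 4, 6, 8])   # all lanes agree
mixed, mixed_steps = run_lockstep([1, 2, 3, 4])       # lanes diverge
```

With uniform input only one side of the branch is issued; with mixed input both sides are issued, so the group as a whole takes longer even though each lane needs only one side.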
Programming standards for parallel computing
include OpenCL (vendor-independent), OpenACC, and OpenHMPP.
The Xcelerit SDK, created by Xcelerit, is designed to accelerate large existing C++ or C# code-bases on GPUs with minimal effort. It provides a simplified programming model, automates parallelisation, manages devices and memory, and compiles to CUDA binaries. Additionally, multi-core CPUs and other accelerators can be targeted from the same source code.
OpenVIDIA was developed at the University of Toronto between 2003 and 2005, in collaboration with Nvidia.
MATLAB supports GPGPU acceleration using the Parallel Computing Toolbox and MATLAB Distributed Computing Server, and third-party packages like Jacket.
GPGPU processing is also used by physics engines to simulate Newtonian physics. Commercial implementations include Havok Physics, FX and PhysX, both of which are typically used for computer and video games.
C++ Accelerated Massive Parallelism (C++ AMP) is a library that accelerates execution of C++ code by exploiting the data-parallel hardware on GPUs.
Altimesh Hybridizer by Altimesh compiles Common
Intermediate Language to CUDA binaries. It supports generics and
virtual functions. Debugging and profiling are integrated with Visual
Studio and Nsight. It is available as a Visual Studio extension on
the Visual Studio Marketplace.
Microsoft introduced the DirectCompute GPU computing API, released with the DirectX 11 API.
Alea GPU by QuantAlea introduces native GPU computing capabilities for the Microsoft .NET languages F# and C#. Alea GPU also provides a simplified GPU programming model based on GPU parallel-for and parallel aggregate using delegates and automatic memory management.
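The parallel-for and parallel-aggregate pattern mentioned above can be sketched independently of any GPU library. The following is a CPU-only Python illustration, not the Alea GPU API: the names `parallel_for` and `parallel_aggregate` are hypothetical, a thread pool stands in for the device, and an ordinary Python callable stands in for the delegate. It assumes the loop body is independent per index and the aggregate operator is associative.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def parallel_for(n, body, workers=4):
    """Call body(i) for each i in range(n); iterations must be independent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for completion and re-raises any exception from body.
        list(pool.map(body, range(n)))


def parallel_aggregate(values, op, workers=4):
    """Reduce values with an associative op by combining per-chunk partials."""
    values = list(values)
    if not values:
        raise ValueError("parallel_aggregate needs at least one value")
    chunk = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: reduce(op, c), chunks))
    return reduce(op, partials)


# Usage: element-wise vector addition followed by a sum.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)


def add_at(i):
    # Each index writes only its own slot, so iterations do not interfere.
    out[i] = a[i] + b[i]


parallel_for(len(a), add_at)
total = parallel_aggregate(out, lambda x, y: x + y)
```

The aggregate is computed as per-chunk partial results that are then combined, which is why the operator has to be associative. A real GPU framework would also handle transferring `a`, `b` and `out` to and from device memory.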
GPU-Gems Part I http://developer.nvidia.com/content/gpu-gems-part-i-natural-effects
Pharr, M.: GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation. Addison-Wesley, Boston, MA, 2005 or http://http.developer.nvidia.com/GPUGems2/gpugems2_part01.html or http://www.addison-wesley.de
Nguyen, H.: GPU Gems 3: Programming Techniques for High-Performance Graphics and General-Purpose Computation. Addison-Wesley, Boston, MA, 2007 or http://www.addison-wesley.de or http://http.developer.nvidia.com/GPUGems3/gpugems3_part01.html
Some newer PDF articles stored locally:
GPU CLUSTER COMPUTING FOR MULTIGRID-FEM SOLVERS WITH APPLICATIONS IN CFD
GPU Simulation and Rendering of Volumetric Effects for Computer Games and Virtual Environments
Higher order FEM numerical integration on GPUs with OpenCL
Implicit FEM and Fluid Coupling on GPU for Interactive Multiphysics Simulation
Fluid–solid coupling on a cluster of GPU graphics cards for seismic wave propagation
Fast seismic modeling and reverse time migration on a GPU cluster
GPU Cluster Computing For Multigrid FEM-Solvers... (abstract)
Assembly of Finite Element Methods on Graphics Processors
Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers see also:
Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers(2)
Finite Element Multigrid Solvers for PDE Problems on GPUs and GPU Clusters Part 2: Applications on GPU Clusters
Exploring weak scalability for FEM calculations on a GPU-enhanced cluster
Accelerating Double Precision FEM Simulations with GPUs
Analyzing CUDA Workloads Using a Detailed GPU Simulator
Automated Finite Element Computations in the FEniCS Framework using GPUs
GPU Cluster Computing for Finite Element Applications
Finite Element Integration on GPUs
Efficient Implementation of Finite Element Operators on GPUs
Massively Parallel Micromagnetic FEM Calculations with Graphical Processing Units (GPUs)
Making Faster FEM Solvers, Faster
General Purpose Computation On Graphics Processing Units
Texts in German:
GPU-basierte Verfahren zur interaktiven Simulation und Darstellung von Fluid-Effekten
Implementierung von FEM-Methoden auf programmierbaren Grafikkarten
auf der GPU, by Alexander Kubias
A little bit older (for SOFA, see below):
Efficient nonlinear FEM for soft tissue modelling and its GPU implementation within the open source framework SOFA
NVIDIA CUDA (Compute Unified Device Architecture), Nvidia's GPGPU technology for Nvidia GeForce-, Quadro- and Tesla-based GPUs (NVIDIA CUDA, German)
Nvidia CUDA Programming Guide for CUDA Toolkit 3.2
Nvidia Developer Web Site
Nvidia Development Whitepapers and Presentations
Nvidia developer resources page
NVIDIA GPU Computing Developer Home Page
Nvidia Free GPU Computing Online Seminars
Nvidia GPU Programming Guide or http://developer.download.nvidia.com/GPU_Programming_Guide/GPU_Programming_Guide_G80.pdf
Nvidia Tesla C1060/C2050/M2090 (512 CUDA cores, up to 665 GFLOPS) (overview, specifications, drivers & downloads, ...) or http://www.nvidia.com/docs/IO/43395/tesla_technical_brief.pdf or http://www.nvidia.com/object/tesla_computing_solutions.html
M2090: http://www.nvidia.com/docs/IO/43395/Tesla-M2090-Board-Specification.pdf
The Next Generation CUDA Architecture, Code Named Fermi (up to 512 CUDA cores) (PDF)
Nvidia GTX 590 / 580 / 570
Stream, AMD/ATI's GPGPU technology for ATI Radeon-based GPUs
AMD Accelerated Parallel Processing (APP) SDK (formerly ATI Stream)
AMD Accelerated Parallel Processing (APP) SDK OpenCL Programming Guide
AMD HD 6990, FireStream 9270 up to 1.2 TFLOPS (single precision), AMD 5970 up to 928 GFLOPS in double precision
AMD ATI FirePro V7800 (overview, technical data, ...) or http://www.amd.com/us/products/workstation/graphics/ati-firepro-3d/v7800/pages/v7800.aspx
AMD APP SDK with OpenCL 1.1 Support
Which graphics card to choose: a "best" card does not exist. You have to decide which ranking matters to you: all cards, high-end cards, or price-performance (G3D Mark / price in $).
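The price-performance ratio is simply the benchmark score divided by the price. As a minimal sketch, the Python snippet below ranks cards by that ratio. The function name `price_performance` is hypothetical, and the card names, scores and prices are made-up placeholders, not real benchmark results.

```python
def price_performance(cards):
    """Return (name, score_per_dollar) pairs sorted best-first.

    cards is an iterable of (name, benchmark_score, price_in_dollars);
    entries with a non-positive price are skipped.
    """
    ranked = [(name, score / price) for name, score, price in cards if price > 0]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Placeholder data only: these are NOT real G3D Mark scores or prices.
cards = [
    ("card A", 3000, 300.0),
    ("card B", 5000, 700.0),
    ("card C", 1500, 100.0),
]

ranking = price_performance(cards)
for name, ratio in ranking:
    print(f"{name}: {ratio:.2f} points per $")
```

In this made-up example the card with the highest raw score ranks last on price-performance, which shows how the "best" card depends on the criterion chosen.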
There are descriptions on the net of how to flash a 465 to a 470 (not for the faint of heart: do a backup first, and note that not all cards can be flashed to a 470!) German description
Test your GPU: GPU-Z
GPU Caps Viewer see also here
A SuperComputer at your fingertips?!
http://atlasfolding.com/?page_id=148 GPU supercomputer with 30 TFLOPS (German)
SuperComputer with the same performance as a supercomputer cluster consisting of hundreds of PCs
http://www.geek.com/articles/chips/new-fastest-supercomputer-uses-7168-nvidia-gpus-14336-intel-cpus-20101028/ Chinese supercomputer
DirectCompute: Microsoft's GPU computing API, initially released with the DirectX 11 API
Microsoft DirectX / DirectCompute or http://www.microsoft.com/games/en-en/aboutgfw/pages/directx.aspx or http://www.nvidia.com/object/cuda_directcompute.html or http://www.nvidia.de/object/directcompute_de.html or http://developer.nvidia.com/category/zone/cuda-zone
Microsoft Parallel Computing Developer Center or here: http://msdn.microsoft.com/en-en/concurrency/default
Intel OpenCL SDK (Windows 7 32/64) or http://software.intel.com/en-us/articles/intel-opencl-sdk
Intel C/C++ Compiler
OpenCL (Open Computing
Language): cross-platform GPGPU language for GPUs
(AMD/ATI/Nvidia) and general-purpose CPUs
Apple's GPU utilization introduced in Mac OS X v10.6 ‘Snow Leopard’
Adventures in OpenCL: Part 1, Getting Started
Adventures in OpenCL: Part 1.5, C++ Bindings
Adventures in OpenCL Part 2: Particles with OpenGL
Brown Deer Technology: OpenCL Tutorial: N-Body Simulation.
OpenCL Programming Guide
OpenCL Quick Reference Card
OpenCV / GpuCV see also: http://opencv.willowgarage.com/wiki
OpenCV / GpuCV links and downloads here
OpenGL and OpenCL Debugger
Open MPI: Open Source High Performance Computing.
OpenMP.org: OpenMP Application Program Interface. Version 3.0, May 2008. pdf: http://www.openmp.org/mp-documents/spec30.pdf
Sh, a GPGPU library for C++
BrookGPU is the Stanford University Graphics group's compiler and runtime implementation of the Brook stream programming language. See also here.
GLSL Shader Programming Resources
CBC Seminar on GPU Programming and Computing
General-Purpose Computation on Graphics Hardware
GNU Scientific Library (GSL) or http://www.gnu.org/software/gsl/manual/html_node
GPUcomputing.net: Research and development community.
TPU
https://www.sigarch.org/why-the-gpgpu-is-less-efficient-than-the-tpu-for-dnns/ Why the GPGPU is Less Efficient than the TPU for DNNs
https://www.quora.com/How-different-is-a-TPU-from-GPU
https://timdettmers.com/2018/10/17/tpus-vs-gpus-for-transformers-bert/
GPU Resources
GPUSort: High Performance Sorting using Graphics Processors or http://gamma.cs.unc.edu/GPUSORT/results.html
Mathematica GPU Computing see also: http://reference.wolfram.com/mathematica/ParallelTools/tutorial/Overview.html or here: http://www.nvidia.de/object/cuda-programming-mathematica-de.html
MATLAB GPU Computing or here http://www.mathworks.de/discovery/matlab-gpu.html or here http://developer.nvidia.com/object/matlab_cuda.html
MIT Open Courseware: Applied Parallel Computing.
MPI standard: The Message Passing Interface Standard, or here http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf or here http://www-unix.mcs.anl.gov/mpi
GPU Floating-Point Paranoia
Salome: pre- and postprocessor
GMV: no longer available for free and is being commercialized.
Tecplot: not free, site licence
VTK: The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization.
VTKEdge: library of advanced visualization and data processing techniques that complement the Visualization Toolkit.
ParaView: an open-source, multi-platform data analysis and visualization application.
VisIt: a free interactive parallel visualization and graphical analysis tool for viewing scientific data on Unix and PC platforms.
GeoMesh (131 KB): simple mesh generator.
GenMesh (190 KB): more general mesh generator.
Casca mesh generator (no longer available? Manual here): the casca program can be used to make a general finite element mesh, which can then be read into Geocrack2D.
Netgen: a multi-platform automatic mesh generation tool written in C++, capable of generating meshes in two and three dimensions. The program is open source.
Tetgen: open-source code for generating tetrahedral meshes. Volume meshes are created from surface meshes.
three-dimensional finite element mesh generator with built-in pre-
and post-processing facilities
LaGriT is a library of user callable tools that provide mesh generation, mesh optimization and dynamic mesh maintenance.
List of mesh generators (public domain and commercial). Another one.
CUBIT (free for governmental use, otherwise commercial) http://www.csimsoft.com/
OpenCTM (last updated 2010-01-15): OpenCTM is a file format, a software library and a tool set for compression of 3D triangle meshes. The geometry is compressed to a fraction of the size of comparable file formats (3DS, STL, COLLADA...), and the format is accessible through a simple, portable API.
Some converters on the old ASME/Mecheng website may still be useful: README, FTP, short description of files
Some Wikipedia pages (select your preferred language):
FEA/FEM packages: Wikipedia article
GPU Computing Gems Emerald Edition (Applications of GPU Computing Series) by Wen-mei W. Hwu (hardcover)
CUDA by Example: An Introduction to General-Purpose GPU Programming by Jason Sanders (paperback)
Programming Massively Parallel Processors: A Hands-on Approach (Applications of GPU Computing Series) by David B. Kirk (paperback)
GPU Pro 2
The Art of Multiprocessor Programming
Scientific Computing with Multicore and Accelerators
FreeBookCentre.Net
Programming utilities here