GPU / GPGPU / TPU programming links / Supercomputing at your fingertips
OpenCL is an open, general-purpose GPU computing
language, standardized by the Khronos Group.
OpenCL provides a cross-platform GPGPU platform that additionally
supports data-parallel compute on CPUs. OpenCL is supported on
Intel, AMD, Nvidia, and ARM platforms. The Khronos Group is
currently involved in the development of SYCL, implementations
of which include ComputeCpp and SYCL STL.
Nvidia CUDA is a proprietary framework. Nvidia launched CUDA in
2006 as a software development kit (SDK) and application programming
interface (API) that lets developers use the C programming language to
code algorithms for execution on GeForce 8 series and later
GPUs.
Close to Metal (CTM), later called Stream, is AMD's GPGPU technology for ATI Radeon-based GPUs. The AMD Stream SDK was released under the AMD EULA in December 2007 after the software stack was rewritten. The Stream SDK provides both high-level and low-level tools for general-purpose access to AMD graphics hardware. Using GPUs to perform computations holds a lot of potential for some applications because GPU microarchitectures differ fundamentally from those of CPUs. GPUs achieve much greater throughput (calculations per second) by executing many programs in parallel and by restricting flow control (the ability of one program to execute instructions independently of another). Modern GPUs also have addressable on-die memory and extremely high-performance multi-channel external memory. AMD subsequently switched from CTM to OpenCL.
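The trade-off between parallel throughput and restricted flow control can be illustrated with a toy model. The sketch below is plain Python and does not touch a GPU; the function name `lockstep_select` and the notion of "lanes" are illustrative assumptions, not part of any vendor API. It mimics a group of lanes that must step through an if/else together: when lanes disagree on the branch, the group runs both branches one after the other and a mask decides which lanes keep each result.

```python
# Toy model of lockstep (SIMT-style) execution. Plain Python, no GPU involved.
# "Lanes" stand in for the parallel programs that share one flow of control.

def lockstep_select(values, predicate, then_fn, else_fn):
    """Run an if/else over all lanes the way a lockstep group would.

    Every lane evaluates the predicate. The group then walks through each
    branch that at least one lane needs; a per-lane mask decides which lanes
    keep that branch's result. Returns (results, passes), where passes is
    the number of branch bodies the whole group had to step through.
    """
    values = list(values)
    mask = [bool(predicate(v)) for v in values]
    results = list(values)
    passes = 0
    if any(mask):                      # some lane takes the "then" branch
        passes += 1
        for lane, v in enumerate(values):
            if mask[lane]:
                results[lane] = then_fn(v)
    if not all(mask):                  # some lane takes the "else" branch
        passes += 1
        for lane, v in enumerate(values):
            if not mask[lane]:
                results[lane] = else_fn(v)
    return results, passes


if __name__ == "__main__":
    lanes = range(8)
    # Divergent case: even and odd lanes want different branches.
    print(lockstep_select(lanes, lambda v: v % 2 == 0, lambda v: v * 10, lambda v: -v))
    # Uniform case: every lane takes the same branch.
    print(lockstep_select(lanes, lambda v: v >= 0, lambda v: v * 10, lambda v: -v))
```

In this model the divergent call costs two passes and the uniform call only one, which is the reason GPU code tends to be written so that neighbouring work items follow the same path.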
Programming standards for parallel computing
include OpenCL (vendor-independent), OpenACC, and OpenHMPP.
The Xcelerit SDK, created by Xcelerit, is designed to accelerate
large existing C++ or C# code-bases on GPUs with minimal effort. It
provides a simplified programming model, automates parallelisation,
manages devices and memory, and compiles to CUDA binaries.
Additionally, multi-core CPUs and other accelerators can be
targeted from the same source code.
OpenVIDIA was developed at the University of Toronto between 2003 and 2005,
in collaboration with Nvidia.
MATLAB supports GPGPU acceleration using the Parallel Computing
Toolbox and MATLAB Distributed Computing Server, and third-party
packages like Jacket.
GPGPU processing is also used by physics engines to simulate
Newtonian physics; commercial implementations include Havok
Physics, FX and PhysX, both of which are typically used for
computer and video games.
C++ Accelerated Massive Parallelism (C++ AMP) is a library that
accelerates execution of C++ code by exploiting the data-parallel
hardware on GPUs.
Altimesh Hybridizer by Altimesh compiles Common
Intermediate Language to CUDA binaries. It supports generics and
virtual functions. Debugging and profiling are integrated with Visual
Studio and Nsight. It is available as a Visual Studio extension on the
Visual Studio Marketplace.
Microsoft introduced the DirectCompute GPU computing API, released
with the DirectX 11 API.
Alea GPU by QuantAlea introduces native GPU computing capabilities
for the Microsoft .NET languages F# and C#. Alea GPU also provides a
simplified GPU programming model based on GPU parallel-for and
parallel aggregate using delegates and automatic memory
management.
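The two patterns named above, parallel-for and parallel aggregate driven by delegates, can be sketched in a few lines. This is not the Alea GPU API: the helper names `parallel_for` and `parallel_aggregate` are hypothetical, the "delegates" are ordinary Python callables, and the work runs on CPU threads rather than on a GPU. It only shows the shape of the programming model, assuming the combine function is associative.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def parallel_for(n, body, workers=4):
    """Call body(i) once for every i in range(n), spread over a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for completion and re-raises any exception from body.
        list(pool.map(body, range(n)))


def parallel_aggregate(items, combine, workers=4):
    """Reduce items with an associative combine(a, b).

    Each chunk is reduced to a partial result independently, then the
    partial results are merged in chunk order.
    """
    items = list(items)
    if not items:
        raise ValueError("parallel_aggregate needs at least one item")
    chunk = max(1, (len(items) + workers - 1) // workers)
    chunks = [items[i:i + chunk] for i in range(0, len(items), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: reduce(combine, c), chunks))
    return reduce(combine, partials)


if __name__ == "__main__":
    n = 1000
    x = [float(i) for i in range(n)]
    y = [0.0] * n

    def scale(i):
        # The "delegate": one independent unit of work per index.
        y[i] = 2.0 * x[i]

    parallel_for(n, scale)
    print(parallel_aggregate(y, lambda a, b: a + b))
```

The split into per-chunk partial results followed by a final merge is the reason the combine function must be associative; the model gives no guarantee about how the elements are grouped.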
GPU-Gems Part I http://developer.nvidia.com/content/gpu-gems-part-i-natural-effects
Pharr, M.: GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation. Addison-Wesley, Boston, MA, 2005 or http://http.developer.nvidia.com/GPUGems2/gpugems2_part01.html or http://www.addison-wesley.de
Nguyen, H.: GPU Gems 3: Programming Techniques for High-Performance Graphics and General-Purpose Computation. Addison-Wesley, Boston, MA, 2007 or http://www.addison-wesley.de or http://http.developer.nvidia.com/GPUGems3/gpugems3_part01.html
Some newer PDF articles stored locally:
GPU Cluster Computing for Multigrid-FEM Solvers with Applications in CFD
GPU Simulation and Rendering of Volumetric Effects for Computer Games and Virtual Environments
Higher order FEM numerical integration on GPUs with OpenCL
Implicit FEM and Fluid Coupling on GPU for Interactive Multiphysics Simulation
Fluid–solid coupling on a cluster of GPU graphics cards for seismic wave propagation
Fast seismic modeling and reverse time migration on a GPU cluster
GPU Cluster Computing For Multigrid FEM-Solvers... (abstract)
Assembly of Finite Element Methods on Graphics Processors
Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers see also:
Towards a complete FEM-based simulation toolkit on GPUs: Geometric Multigrid solvers(2)
Finite Element Multigrid Solvers for PDE Problems on GPUs and GPU Clusters Part 2: Applications on GPU Clusters
Exploring weak scalability for FEM calculations on a GPU-enhanced cluster
Accelerating Double Precision FEM Simulations with GPUs
Analyzing CUDA Workloads Using a Detailed GPU Simulator
Automated Finite Element Computations in the FEniCS Framework using GPUs
GPU Cluster Computing for Finite Element Applications
Finite Element Integration on GPUs
Efficient Implementation of Finite Element Operators on GPUs
Massively Parallel Micromagnetic FEM Calculations with Graphical Processing Units (GPUs)
Making Faster FEM Solvers, Faster
General GPU:
General Purpose Computation On Graphics Processing Units
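Texts in German:
GPU-basierte Verfahren zur interaktiven Simulation und Darstellung von Fluid-Effekten (GPU-based methods for interactive simulation and rendering of fluid effects)
Implementierung von FEM-Methoden auf programmierbaren Grafikkarten (Implementation of FEM methods on programmable graphics cards)
FFT auf der GPU (FFT on the GPU), by Alexander Kubias
A little bit older (SOFA, see below):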
Software:
NVIDIA CUDA (Compute Unified Device Architecture), Nvidia's GPGPU technology for Nvidia GeForce-, Quadro- and Tesla-based GPUs (NVIDIA CUDA, German)
Nvidia CUDA Programming Guide for CUDA Toolkit 3.2
http://developer.download.nvidia.com/compute/DevZone/C/html/featured_samples.html
Nvidia Development Whitepapers and Presentations
Nvidia developer resources page
NVIDIA GPU Computing Developer Home Page
Nvidia Free GPU Computing Online Seminars
Nvidia GPU Programming Guide or http://developer.download.nvidia.com/GPU_Programming_Guide/GPU_Programming_Guide_G80.pdf
Nvidia Tesla C1060/C2050/M2090 (512 CUDA cores, up to 665 GFLOPS) (Overview, Specifications, Drivers & Downloads, ...) or http://www.nvidia.com/docs/IO/43395/tesla_technical_brief.pdf or http://www.nvidia.com/object/tesla_computing_solutions.html
M2090: http://www.nvidia.com/docs/IO/43395/Tesla-M2090-Board-Specification.pdf
The Next Generation CUDA Architecture, Code Named Fermi (up to 512 CUDA cores) (PDF)
Nvidia GTX 590 / 580 / 570
https://stackoverflow.com/questions/10460742/how-do-cuda-blocks-warps-threads-map-onto-cuda-cores/10467342#10467342
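As a companion to the question linked above, the index arithmetic behind a one-dimensional CUDA launch can be written out in plain Python. This is only a sketch: the function names are made up for illustration, the warp size of 32 is the value Nvidia GPUs have used to date, and how warps are actually scheduled onto cores is hardware-specific and not modelled here.

```python
WARP_SIZE = 32  # warp width on Nvidia GPUs to date (assumption for this sketch)


def global_thread_id(block_idx, block_dim, thread_idx):
    """1D equivalent of blockIdx.x * blockDim.x + threadIdx.x in a CUDA kernel."""
    return block_idx * block_dim + thread_idx


def warp_and_lane(thread_idx):
    """Warp number inside the block and lane inside that warp (1D block)."""
    return divmod(thread_idx, WARP_SIZE)


def warps_per_block(block_dim):
    """Warps a block occupies; a partly filled last warp still counts as one."""
    return (block_dim + WARP_SIZE - 1) // WARP_SIZE


def blocks_needed(n, block_dim):
    """Blocks required so that every one of n elements gets a thread."""
    return (n + block_dim - 1) // block_dim


if __name__ == "__main__":
    n, block_dim = 1000, 256
    print("blocks:", blocks_needed(n, block_dim))
    print("warps per block:", warps_per_block(block_dim))
    print("thread 5 of block 2 handles element", global_thread_id(2, block_dim, 5))
    print("thread 70 sits in (warp, lane)", warp_and_lane(70))
```

Because the block count is rounded up, the last block usually contains threads whose global index is at or beyond n, so a kernel written this way needs a bounds check before touching its element.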
ATI:
Stream, AMD/ATI's GPGPU technology for ATI Radeon-based GPUs
http://ati.amd.com/developer/index.html
AMD Accelerated Parallel Processing (APP) SDK (formerly ATI Stream)
AMD Accelerated Parallel Processing (APP) SDK OpenCL Programming Guide
AMD HD 6990, FireStream 9270 up to 1.2 TFLOPS (single precision), AMD 5970 up to 928 GFLOPS in double precision
AMD ATI FirePro V7800 (overview, Technical Data, ...) or http://www.amd.com/us/products/workstation/graphics/ati-firepro-3d/v7800/pages/v7800.aspx
AMD APP SDK with OpenCL 1.1 Support
Which graphics card to choose: a "best" card does not exist. You have to choose which ranking matters to you: all cards, high-end cards, or price-performance (G3D Mark / price in $)
There are descriptions on the net of how to flash a 465 to a 470 (not for the faint of heart: do a backup first, and note that not all cards can be flashed to a 470!) German description
Test your GPU: GPU-Z
GPU Caps Viewer see also here
A SuperComputer at your fingertips? !!!
http://atlasfolding.com/?page_id=148 GPU supercomputer with 30 TFLOPS (German)
SuperComputer with the same performance as a supercomputer cluster consisting of hundreds of PCs
http://www.geek.com/articles/chips/new-fastest-supercomputer-uses-7168-nvidia-gpus-14336-intel-cpus-20101028/ Chinese supercomputer
http://www.dvhardware.net/article27538.html
Microsoft:
DirectCompute Microsoft's GPU Computing API - Initially released with the DirectX 11 API
Microsoft DirectX / DirectCompute or http://www.microsoft.com/games/en-en/aboutgfw/pages/directx.aspx or http://www.nvidia.com/object/cuda_directcompute.html or http://www.nvidia.de/object/directcompute_de.html or http://developer.nvidia.com/category/zone/cuda-zone
Microsoft Parallel Computing Developer Center or here: http://msdn.microsoft.com/en-en/concurrency/default
Intel:
Intel OpenCL SDK (Windows 7 32/64) or http://software.intel.com/en-us/articles/intel-opencl-sdk
Open source:
OpenCL (Open Computing
Language) cross platform GPGPU language for GPUs
(AMD/ATI/Nvidia) and general purpose CPUs
Apple's GPU utilization introduced in Mac OS X v10.6 ‘Snow
Leopard’
Adventures in OpenCL: Part 1, Getting Started
Adventures in OpenCL: Part 1.5, C++ Bindings
Adventures in OpenCL Part 2: Particles with OpenGL
Brown Deer Technology: OpenCL Tutorial: N-Body Simulation.
OpenCV / GpuCV see also: http://opencv.willowgarage.com/wiki
OpenCV / GpuCV links and downloads here
Open MPI: Open Source High Performance Computing.
OpenMP.org: OpenMP Application Program Interface. Version 3.0, May 2008. pdf: http://www.openmp.org/mp-documents/spec30.pdf
Sh, a GPGPU library for C++
BrookGPU is the Stanford University Graphics group's compiler and runtime implementation of the Brook stream programming language. See also here.
GLSL Shader Programming Resources
CBC Seminar on GPU Programming and Computing
General-Purpose Computation on Graphics Hardware
GNU Scientific Library (GSL) or http://www.gnu.org/software/gsl/manual/html_node
GPUcomputing.net: Research and development community.
TPU:
https://www.sigarch.org/why-the-gpgpu-is-less-efficient-than-the-tpu-for-dnns/ Why the GPGPU is Less Efficient than the TPU for DNNs
https://www.quora.com/How-different-is-a-TPU-from-GPU
https://timdettmers.com/2018/10/17/tpus-vs-gpus-for-transformers-bert/
GPU Resources
GPUSort: High Performance Sorting using Graphics Processors or http://gamma.cs.unc.edu/GPUSORT/results.html
Mathematica GPU Computing see also: http://reference.wolfram.com/mathematica/ParallelTools/tutorial/Overview.html or here: http://www.nvidia.de/object/cuda-programming-mathematica-de.html
MATLAB GPU Computing or here http://www.mathworks.de/discovery/matlab-gpu.html or here http://developer.nvidia.com/object/matlab_cuda.html
MIT OpenCourseWare: Applied Parallel Computing.
MPI standard: The Message Passing Interface Standard. or here http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf or here http://www-unix.mcs.anl.gov/mpi
GMV: no longer available for free and is being commercialized.
Tecplot: not free, site licence
ParaView is an open-source, multi-platform data analysis and visualization application.
GeoMesh (131 KB). simple mesh generator
GenMesh (190 KB) more general mesh generator.
Casca
mesh generator (no longer available? manual here). The Casca program
can be used to make a general finite element mesh, which can then be
read into Geocrack2D.
Netgen is a multi-platform automatic mesh generation tool written in C++ capable of generating meshes in two and three dimensions. The program is open source.
Tetgen Open source code for generating tetrahedral meshes. Volume mesh created from surface meshes.
Gmsh: a
three-dimensional finite element mesh generator with built-in pre-
and post-processing facilities
LaGriT is a library of user
callable tools that provide mesh generation, mesh optimization and
dynamic mesh maintenance.
List of mesh generators (public domain and commercial) Another one.
CUBIT (free for governmental use, else commercial) http://www.csimsoft.com/
OpenCTM (last updated 2010-01-15) OpenCTM is a file format, a software library and a tool set for compression of 3D triangle meshes. The geometry is compressed to a fraction of the size of comparable file formats (3DS, STL, COLLADA...), and the format is accessible through a simple, portable API.
Some converters may still be useful on the old ASME/Mecheng website README, FTP, short description of files
Some Wikipedia pages (select your preferred language):
ATI-Stream: http://en.wikipedia.org/wiki/ATI-Stream
CUDA: http://en.wikipedia.org/wiki/CUDA
DirectCompute: http://en.wikipedia.org/wiki/DirectCompute
DirectX: http://en.wikipedia.org/wiki/Directx
GPGPU: http://en.wikipedia.org/wiki/GPGPU
Graphics processor (German: Grafikprozessor): http://en.wikipedia.org/wiki/Grafikprozessor
OpenCL: http://en.wikipedia.org/wiki/OpenCL
OpenCV: http://en.wikipedia.org/wiki/OpenCV
FEA/FEM packages: Wikipedia article
Books:
GPU Computing Gems Emerald Edition (Applications of GPU Computing Series) by Wen-mei W. Hwu (hardcover)
CUDA by Example: An Introduction to General-Purpose GPU Programming by Jason Sanders (paperback)
GPU Pro 2
The Art of Multiprocessor Programming
Scientific Computing with Multicore and Accelerators
FreeBookCentre.Net
Programming utilities here