High Performance Computing (HPC) generally refers to the practice of aggregating computing power to deliver much higher performance than a typical desktop computer or workstation can provide when solving large problems in science, engineering, or business. HPC allows us to tackle the most advanced problems using massively parallel platforms (e.g. those listed at www.top500.org). Sample challenges for HPC include:
- weather forecasting,
- molecular biosciences,
- modeling the motion of astronomical bodies,
- chemical and thermal systems, etc.
Basic idea: if you have a problem that takes 100 hours on a single CPU, why not use a combination of multiple CPUs and accelerators such as GPUs and solve it in a fraction of an hour, or even in minutes?
At byteLAKE, we have extensive experience in porting, developing, and optimizing software for supercomputer platforms (i.e. for performance, energy efficiency, and accuracy of calculations). Our well-recognized PhD researchers are doing pioneering work on advanced methods for parallel software design. By creating new methods for code and performance portability, and combining these with the best practices in the field, we help our clients develop efficient software across a wide range of computational platforms. In the context of artificial intelligence, HPC is crucial during the complex training phases of neural networks. Find out more in our presentation at SlideShare.
- Code adaptation to a variety of computing platforms (i.e. server/desktop GPUs, multicore CPUs, mobile platforms)
- Optimization based on three criteria: performance, energy efficiency (Green Computing), and accuracy of calculations
- Profiling and analysis:
- identifying the most time-consuming parts of the application,
- diagnosing bottlenecks, load imbalance, and idle resources,
- rebuilding code using different methods, techniques, and algorithms.
- End-to-end HPC solutions: fully configured hardware and a highly optimized custom software application. Read more here.
Techniques and methods highlights:
- Software auto-tuning – design and development of self-adapting codes for computing architectures
- DVFS – dynamic voltage and frequency scaling
- CT – Concurrency throttling
- Mixed precision arithmetic
- Scalability improvement across all the cluster resources
- Common techniques: data alignment, blocking, overlapping, streaming, register queues, …
- MPI – Message Passing Interface
- CUDA – Compute Unified Device Architecture
- OpenCL – Open Computing Language
- OpenMP – Open Multi-Processing
- C, C++, C++-11
- Pthreads – POSIX Threads
- and others
We understand heterogeneous computing as more than just computations performed on a GPU/Intel Xeon Phi architecture with the CPU acting as a management layer. Our company focuses on utilizing all available computing resources to unleash the power of both the accelerators (GPU/Intel Xeon Phi) and the host processors (CPUs). In this approach we optimize the following aspects:
- selecting the right programming model for a given problem (task parallelism, data parallelism, or a mixture of the two),
- providing the right balance between CPUs and GPUs, which process the algorithm at different speeds,
- optimizing the data transfers between the host memory and accelerators.
Struggling with building optimized software for HPC systems? Want to port, re-design, or optimize your applications to make the most of massively parallel computing platforms?