PQEDZ2GHMD Application Benchmarks

HP ES45/ES80/GS1280, IBM pSeries 690, and SGI/Cray Origin 2000
  updated: April 2003


  Scalability Charts - Description

  Jan-Apr 2003: ES45/ES80/GS1280 AlphaServers, IBM pSeries 690, SGI/Cray Origin 2000
     OpenMP port v.5

  Dec 2002: ES45/ES80 AlphaServers
     OpenMP ports v.2 and v.3

  Nov 2002: ES45/ES80 AlphaServers
     OpenMP ports v.1 and v.2


  General Notes

  PQEDZ2GHMD is a periodic quantum electrodynamics (QED) simulator featuring U(1)
  lattice gauge symmetry. An extra four-fermi interaction term with Z2 symmetry is
  added to the QED part to speed the algorithm. HMD stands for Hybrid Molecular
  Dynamics, the simulation algorithm (a variety of Monte Carlo adapted to QED to
  incorporate its fermions).

  Timing data were gathered for lattice dimensions 16^4, 24^4, and 32^4 on an
  ES45 Model 2 production node (lemieux) and a newer ES80 research node (iceman).
  Results for an SGI/Cray Origin 2000 were added later. Additional information
  regarding these systems is provided below.

  The Alpha machines have 4 processors per node, so timings were obtained for the
  original serial code and for three OpenMP ports running 1, 2, and 4 threads per node.
  For the Origin, the thread count was extended to include 8, 16, and 32 threads.
  Derived values, such as "perfect scaling" lines, were calculated and plotted.
  In all cases, the quantity measured was the time required to complete the first
  1,000 sweeps. Complete simulations involve on the order of a million sweeps.
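
  The derived quantities mentioned above can be sketched as follows. This is a
  minimal illustration, not the code used to produce the charts. The function names
  are hypothetical, the timings are placeholder values rather than measured results,
  and speedup is taken relative to the 1-thread run (the original analysis may
  instead have used the serial code as its baseline).

```python
def perfect_scaling(t_one_thread, thread_counts):
    """Ideal time at each thread count, assuming T(n) = T(1) / n."""
    return {n: t_one_thread / n for n in thread_counts}


def speedup_and_efficiency(timings):
    """Map thread count -> (speedup, parallel efficiency).

    `timings` maps thread count -> seconds for the first 1,000 sweeps
    and must contain an entry for 1 thread.
    """
    t1 = timings[1]
    result = {}
    for n, t in sorted(timings.items()):
        speedup = t1 / t
        result[n] = (speedup, speedup / n)
    return result


def extrapolate_full_run(t_benchmark, benchmark_sweeps=1000, full_sweeps=1_000_000):
    """Scale a 1,000-sweep timing to a full run of about a million sweeps.

    Assumes the cost per sweep stays constant over the whole run.
    """
    return t_benchmark * full_sweeps / benchmark_sweeps


# Placeholder timings in seconds per 1,000 sweeps. NOT measured data.
placeholder_timings = {1: 400.0, 2: 220.0, 4: 125.0}

ideal = perfect_scaling(placeholder_timings[1], placeholder_timings.keys())
for n, (s, e) in speedup_and_efficiency(placeholder_timings).items():
    print(f"{n} threads: ideal {ideal[n]:.1f} s, speedup {s:.2f}, efficiency {e:.2f}")
```

  With real data, one such table would be needed per platform, lattice size, and
  port version.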

  These benchmarks are significant for the following reasons:

  o  They demonstrate cases for which an 800-MHz EV7 research system outperforms a
     1000-MHz EV6.8 production platform

  o  They reflect what might be expected of OpenMP ports developed from vector codes

  The results reveal differences in scaling properties among the Alpha platforms and
  among the OpenMP ports. Comparisons between platforms tend to favor the EV6 for the
  small and medium problems, and the EV7 for the largest problem.

  Experiments were also performed on an Origin 2000 to test scalability. During the
  port of the Alpha code to the MIPS system, a number of small changes were made to
  improve the portability of the resultant source; the result became v.5. Extending
  the results to 32 threads on the Origin system showed how the code might be expected
  to scale beyond 4 threads on other platforms.

  For absolute performance, the reference point is HPM data recorded some months
  earlier while running a similar code on the NPACI T90. Approximately two EV6
  processors were required to match the performance of one T90 CPU. Serial timings
  for one Power4 processor were approximately equal to those for one T90 processor.

  o  Test Platforms
  o  Porting Effort
  o  Results


  PQEDZ2GHMD References

  o  Paper describing scientific results for this model
  o  HP Sciport Library
  o  Alpha EV6 Technical Brief
  o  Alpha EV7 Processor: A High Performance Tradition Continues