XStream 3.0 Hardware FDTD
XStream® Hardware FDTD uses the latest GPU computing technology to provide dramatic speed increases in XFdtd® calculations. By using the ability of the GPU (Graphics Processing Unit) in modern computer graphics cards to stream floating point calculations, XFdtd achieves extremely fast calculation speeds via the XStream Hardware FDTD option. XStream Hardware FDTD is now based on the NVIDIA FX 5600 GPU with 1.5 GBytes of accelerated memory. There are three versions of XStream 3.0 available from Remcom: the basic XStream Single GPU, the XStream MicroCluster, and the XStream MiniCluster. XStream provides the calculation speed of a computer cluster at a fraction of the cost. Details of each version of XStream are below.
There are some XFdtd features currently not available for the XStream FDTD cards.

Speed Comparisons

How fast is Version 3.0 XStream Hardware FDTD? Results depend on the size of the FDTD mesh, but for calculations that fit within the memory constraints of the GPUs (3 GBytes for a Micro-Cluster and 6 GBytes for a Mini-Cluster), calculation times are on the order of, or faster than, those of a 32-node computer cluster and much faster than running the calculation on even a Dual Processor Dual Core computer workstation.

This is illustrated here using a horn antenna geometry meshed with two different cell sizes: one with about 64 million mesh cells requiring about 2.7 GBytes for calculation, and another with about 146 million mesh cells requiring about 5.7 GBytes. XFdtd calculations for these two meshes were made on a computer workstation with two AMD Opteron 2216 Dual Core processors. A baseline calculation was made with a single processor using one core, and another with two processors using both cores in each processor. The smaller mesh was also run on an XStream Micro-Cluster, and both meshes were run on an XStream Mini-Cluster.

The relative calculation speeds for the different hardware choices for each mesh, normalized to the single processor, single core calculation time, are shown in the following graph. The increase in computation speed with the Mini-Cluster is nearly 60 times for the larger mesh and almost 35 times for the smaller mesh. For the smaller mesh the Micro-Cluster provides a speed increase of 29 times. Even relative to full utilization of the Dual Processor Dual Core workstation hardware, the Mini-Cluster provides a speedup of over 16 times for the larger mesh, so a calculation requiring 8 hours on the Dual Processor Dual Core workstation is completed in less than 30 minutes on a Mini-Cluster. The potential savings in engineering time alone justify purchase of the appropriate Micro- or Mini-Cluster.
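The memory limits and speedup figures quoted above can be turned into a quick back-of-the-envelope check. The following is only a minimal sketch in Python: the function and variable names are hypothetical and are not part of XFdtd or XStream, and the numbers are simply the ones stated in the text (3 and 6 GBytes of GPU memory, 2.7 and 5.7 GBytes for the two horn antenna meshes, and a speedup of 16 relative to the fully used workstation).

```python
# Illustrative only: names below are assumptions, not an XFdtd/XStream API.
# GPU memory per system, in GBytes, as quoted in the text.
GPU_MEMORY_GB = {"Micro-Cluster": 3.0, "Mini-Cluster": 6.0}


def fits_in_memory(required_gb, system):
    """Return True if a mesh needing required_gb fits on the given system."""
    return required_gb <= GPU_MEMORY_GB[system]


def accelerated_minutes(baseline_hours, speedup):
    """Convert a baseline run time in hours to accelerated minutes."""
    return baseline_hours * 60.0 / speedup


# Smaller mesh (about 2.7 GBytes) fits on a Micro-Cluster.
print(fits_in_memory(2.7, "Micro-Cluster"))
# Larger mesh (about 5.7 GBytes) needs a Mini-Cluster.
print(fits_in_memory(5.7, "Micro-Cluster"), fits_in_memory(5.7, "Mini-Cluster"))
# An 8 hour workstation run at a speedup of 16 takes 30 minutes.
print(accelerated_minutes(8, 16))
```

Because the quoted speedup is "over 16", the 30 minute result is an upper bound on the Mini-Cluster run time for that case, which matches the "less than 30 minutes" stated above.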