About:
Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)(TM) Streaming-Aggregation Hardware Design and Evaluation
An Entity of Type: schema:ScholarlyArticle, within Data Space: wasabi.inria.fr, associated with source document(s)
Attributes
Values
type
Academic Article
research paper
schema:ScholarlyArticle
isDefinedBy
Covid-on-the-Web dataset
has title
Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)(TM) Streaming-Aggregation Hardware Design and Evaluation
Creator
Cho, David
Klein, Daniel
Bloch, Gil
Burredy, Devendar
Elias, George
Graham, Richard
Ladd, Joshua
Levi, Lion
Maor, Ophir
Marelli, Ami
Petrov, Valentin
Qin, Yong
Romlet, Evyatar
Shainer, Gilad
Zemah, Ido
Source
PMC
abstract
This paper describes the new hardware-based streaming-aggregation capability added to Mellanox’s Scalable Hierarchical Aggregation and Reduction Protocol in its HDR InfiniBand switches. For large messages, this capability is designed to achieve reduction bandwidths similar to those of point-to-point messages of the same size, and complements the latency-optimized low-latency aggregation reduction capabilities, aimed at small data reductions. MPI_Allreduce() bandwidth measured on an HDR InfiniBand based system achieves about 95% of network bandwidth. For medium and large data reduction this also improves the reduction bandwidth by a factor of 2–5 relative to host-based (e.g., software-based) reduction algorithms. Using this capability also increased DL-Poly and PyTorch application performance by as much as 4% and 18%, respectively. This paper describes SHARP Streaming-Aggregation hardware architecture and a set of synthetic and application benchmarks used to study this new reduction capability, and the range of data sizes for which Streaming-Aggregation performs better than the low-latency aggregation algorithm.
has issue date
2020-05-22 (xsd:dateTime)
bibo:doi
10.1007/978-3-030-50743-5_3
has license
cc-by
sha1sum (hex)
3ed10468b420b300a23737e958a8aad5624554d0
schema:url
https://doi.org/10.1007/978-3-030-50743-5_3
resource representing a document's title
Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)(TM) Streaming-Aggregation Hardware Design and Evaluation
has PubMed Central identifier
PMC7295336
schema:publication
High Performance Computing
resource representing a document's body
covid:3ed10468b420b300a23737e958a8aad5624554d0#body_text
is schema:about of
named entity 'PyTorch'
named entity 'medium'
named entity 'network bandwidth'
covid:arg/3ed10468b420b300a23737e958a8aad5624554d0
named entity 'MPI'
named entity 'algorithms'
named entity 'low-latency'
named entity 'InfiniBand'
named entity 'This'
named entity 'Mellanox'
named entity 'HDR'
named entity 'bandwidths'
named entity 'Reduction'
named entity 'bandwidth'
named entity 'software-based'
named entity 'algorithm'
named entity 'Hardware Design'
named entity 'PCI bus'
named entity 'ping-pong'
named entity 'radix'
named entity 'unicast'
named entity 'PCIe'
named entity 'orders of magnitude'
named entity 'High Performance Computing'
named entity 'vector'
named entity 'data throughput'
named entity 'multicast'
named entity 'data distribution'
named entity 'vectors'
named entity 'collective operation'
named entity 'short message'
named entity 'point-to-point'
named entity 'throughput'
named entity 'CPU'
named entity 'message size'
named entity 'firmware'
named entity 'MTU'
named entity 'latency'
named entity 'wire speed'
named entity 'asynchronous'
named entity 'half and half'
named entity 'sec'
named entity 'electrostatics'
named entity 'MByte'