000189273 001__ 189273
000189273 005__ 20210129215332.0
000189273 037__ $$aFZJ-2015-02451
000189273 1001_ $$0P:(DE-Juel1)132142$$aHomberg, Wilhelm$$b0$$eCorresponding Author$$ufzj
000189273 1112_ $$aSHARE Europe (SEAS) Spring Meeting 1991$$cLausanne$$d1991-04-08 - 1991-04-12$$wSwitzerland
000189273 245__ $$aIBM 3090 Memory Access: Measurement and Simulation
000189273 260__ $$c1991
000189273 29510 $$aProceedings of the SHARE Europe (SEAS) Spring Meeting 1991
000189273 300__ $$a14 p.
000189273 3367_ $$0PUB:(DE-HGF)8$$2PUB:(DE-HGF)$$aContribution to a conference proceedings$$bcontrib$$mcontrib$$s1428933195_30910
000189273 3367_ $$0PUB:(DE-HGF)7$$2PUB:(DE-HGF)$$aContribution to a book$$mcontb
000189273 3367_ $$033$$2EndNote$$aConference Paper
000189273 3367_ $$2ORCID$$aCONFERENCE_PAPER
000189273 3367_ $$2DataCite$$aOutput Types/Conference Paper
000189273 3367_ $$2DRIVER$$aconferenceObject
000189273 3367_ $$2BibTeX$$aINPROCEEDINGS
000189273 520__ $$aThe imbalance between processor speed and memory access time is one characteristic issue of modern high-speed computers sometimes leading to a bottleneck for algorithms with a great amount of memory traffic. Different architectural concepts are provided to diminish this effect. One attempt to reach this goal is the implementation of a hierarchical memory structure. On IBM computers, this hierarchy includes CPU, cache (high-speed buffer), main memory, and paging devices (e.g. expanded storage and disks). A model focusing on the memory architecture of the IBM 3090 family is developed and the behavior of sequential algorithms with respect to cache and translation lookaside buffer (TLB) is considered. For a given sequence of memory references the numbers of cache misses and TLB misses are calculated and the amount of CPU time consumed by these effects can be estimated. Parameters describing the buffer size, line length, the number of sets, and different replacement algorithms are taken into account. The model is verified by studying the performance of application-oriented programs from numerical linear algebra with a known memory access pattern. Jobs were run on IBM 3090 computers differing in cache size and processor cycle time. The measured CPU time consumption is compared with the predicted results showing that measurement and simulation are in good agreement.
000189273 536__ $$0G:(DE-HGF)POF2-899$$a899 - ohne Topic (POF2-899)$$cPOF2-899$$fPOF I$$x0
000189273 7001_ $$0P:(DE-Juel1)130461$$aHake, Jürgen-Fr.$$b1$$ufzj
000189273 7001_ $$0P:(DE-Juel1)132121$$aGürich, Wolfgang$$b2$$ufzj
000189273 909CO $$ooai:juser.fz-juelich.de:189273$$pVDB
000189273 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132142$$aForschungszentrum Jülich GmbH$$b0$$kFZJ
000189273 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)130461$$aForschungszentrum Jülich GmbH$$b1$$kFZJ
000189273 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132121$$aForschungszentrum Jülich GmbH$$b2$$kFZJ
000189273 9132_ $$0G:(DE-HGF)POF3-899$$1G:(DE-HGF)POF3-890$$2G:(DE-HGF)POF3-800$$aDE-HGF$$bForschungsbereich Materie$$lForschungsbereich Materie$$vohne Topic$$x0
000189273 9131_ $$0G:(DE-HGF)POF2-899$$1G:(DE-HGF)POF2-890$$2G:(DE-HGF)POF2-800$$3G:(DE-HGF)POF2$$4G:(DE-HGF)POF$$aDE-HGF$$bProgrammungebundene Forschung$$lohne Programm$$vohne Topic$$x0
000189273 9201_ $$0I:(DE-Juel1)VDB62$$kZAM$$lZentralinstitut für Angewandte Mathematik$$x0
000189273 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x1
000189273 980__ $$acontrib
000189273 980__ $$aVDB
000189273 980__ $$acontb
000189273 980__ $$aI:(DE-Juel1)VDB62
000189273 980__ $$aI:(DE-Juel1)JSC-20090406
000189273 980__ $$aUNRESTRICTED
000189273 981__ $$aI:(DE-Juel1)JSC-20090406