jlauro said:

This idea will not work with only a small number of CPUs, and it will not work well with multiple physical CPUs either, because it assumes uniform memory access (which is not easily done with a large number of cores without a cross-bar switch). You were probably working with a non-uniform memory architecture, or else a small number of processors. Unless you put everything on one chip, including enough cores and enough memory (4k minimum per thread, 64+k better) to handle a large number of threads, you will just run out of memory bandwidth...

Well, if you believe Intel, Larrabee will have UMA with a ring bus, and the CBE can actually be put in cache-coherence mode; it's just that nobody does, because performance takes a huge hit. I would agree, though, that it is ideally done with a cross-bar.

I'm not going to touch the QoS stuff with a ten-foot pole. It's been a few years now, but at the time the actual bounds were not so hot and involved heaps of queueing theory. It does depend, though, on what you are trying to maximise: throughput, fairness, or deadlines met.