The large energy cost of memory fetches limits the overall efficiency of applications no matter how efficient the accelerators are on the chip. As a result, the most important optimization must be done at the algorithm level, to reduce off-chip memory accesses, to create Dark Memory. The algorithms must first be (re)written for both locality and parallelism before you tailor the hardware to accelerate them. Using Pareto curves in the energy/op and mm²/(op/s) space allows one to quickly evaluate different accelerators, memory systems, and even algorithms to understand the trade-offs between performance, power and die area. This analysis is a powerful way to optimize chips in the Dark Silicon era.
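As a rough illustration of the Pareto analysis described above, the following sketch extracts the non-dominated designs from a set of candidates measured in energy/op and mm²/(op/s), where lower is better on both axes. The `Design` type, the `pareto_front` helper, and every number in `DESIGNS` are hypothetical placeholders chosen for illustration; they are not measurements from this work.

```python
from typing import List, NamedTuple


class Design(NamedTuple):
    """One candidate design point (hypothetical, for illustration only)."""
    name: str
    energy_per_op: float        # e.g. pJ/op, lower is better
    area_per_throughput: float  # mm^2/(op/s), normalized units, lower is better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Return the designs that no other design beats on both metrics.

    Sweep the designs in order of increasing energy/op and keep a design
    only if its area cost is strictly lower than every design seen so far.
    """
    front = []
    best_area = float("inf")
    ordered = sorted(designs, key=lambda d: (d.energy_per_op, d.area_per_throughput))
    for d in ordered:
        if d.area_per_throughput < best_area:
            front.append(d)
            best_area = d.area_per_throughput
    return front


# Made-up design points: (name, energy/op, area per unit throughput).
DESIGNS = [
    Design("A", 10.0, 1.0),
    Design("B", 5.0, 2.0),
    Design("C", 6.0, 3.0),   # worse than B on both axes
    Design("D", 2.0, 8.0),
    Design("E", 12.0, 0.5),
    Design("F", 12.0, 0.9),  # same energy as E but more area
]

if __name__ == "__main__":
    for d in pareto_front(DESIGNS):
        print(f"{d.name}: {d.energy_per_op} energy/op, {d.area_per_throughput} area/throughput")
```

With these placeholder values the frontier consists of D, B, A and E, in order of increasing energy/op. Designs C and F are dropped because another candidate is at least as good on both axes. The same sweep applies whether the candidates differ in accelerator, memory system, or algorithm, provided all of them are measured in the same two units.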