New EPSRC project C3: Scalable & Verified Shared Memory

C3: Scalable & Verified Shared Memory via Consistency-directed Cache Coherence

Susmit Sarkar with colleagues in the University of Edinburgh and Intel as project partners, have been successful in their application to the EPSRC for their project C3: Scalable & Verified Shared Memory via Consistency-directed Cache Coherence. This 3 year project starts in July 2015 and aims to realise scalable and verified shared memory.

Shared-memory multi-core processors are ubiquitous, but programming them remains challenging. The programming model exposed by such multi-core processors depends crucially on a “memory consistency model” (MCM), a contract between the hardware and the programmer, which essentially specifies what value a read can return. On the hardware side, one key mechanism to implement the memory consistency model is the “cache-coherence protocol” (CCP), which essentially communicates memory operations between processors. However, the connection between the CCP and the MCM remains unclear. This is especially true for modern CCPs and MCMs, in which CCP design has been divorced from the requirements of the MCM. Susmit and his colleagues argue that this has negatively impacted the scalability and the verifiability of CCPs.

On the scalability front, there are serious question marks about sustaining cache coherence as the number of cores continue to scale. On the verification front, the application of existing verification techniques, which do not verify the CCP against the MCM, are arguably broken.

The C3 proposal, proposes a family of CCPs that are “aware” of, and verified against the MCM. Their approach is motivated by the fact that both hardware and programming languages are converging to various relaxed MCMs for performance oriented reasons. The team use such relaxed MCMs as inspiration to research CCPs that can take advantage of them.

Specifically, they will research “lazy” CCPs where memory operations are batched, and the cost of communicating a memory operation can be amortised. They will also, for the first time, formally verify the relationship between the hardware CCPs and the programmer-oriented MCM they provide. They will investigate rigorously the gains to be had from such lazy CCPs. The team will do this by creating a multi-core silicon prototype of our proposed CCP, leveraging our experience in the design of industrial-strength micro-architectures and their implementations.