Experimental Mini-Cluster

Experimental Mini-Cluster to Support VLSI and Architecture Classes
Sandeep Gupta, Collaborators: Woojoo Lee, Doochul Shin
Sponsored: Spring 2011

Background: We proposed to acquire, configure, and manage a mini-cluster on an experimental basis. The objectives of the proposed experiments were (i) to demonstrate that such a cluster is beneficial, in that it provides students with upgraded tool versions, libraries, and technologies, and (ii) to study the feasibility of installing, troubleshooting, and maintaining our own cluster using a combination of available resources: an external vendor for installation and occasional major repairs, a fraction of a departmental IT staff member’s effort, and a designated TA for the tools used in our classes.

Developments to date: Using the MHI funds, we have acquired an eight-node cluster. The cluster was installed by an outside vendor under the guidance of Murali Annavaram, with significant help from two of his senior doctoral students, and became available around the middle of Spring 2011. (Our new machine has been named Dhaulgiri.) Within a couple of weeks, our Tools TA, Woojoo Lee, who is funded by the department, installed all the Cadence tools that we use in our VLSI Design classes, especially EE 577a, along with the associated utilities, libraries, and technology files.

Since it was late in the semester, we did not make the cluster and tools available to the students enrolled in EE 577a. Instead, we carried out extensive benchmarking on ITS’s Student Computing Facility (SCF) cluster and on our new CEng mini-cluster. SCF, which now has only two main machines (aludra and nunki), had become so slow that we were forced to give students in our 577a class a design assignment eight times smaller than the one we had assigned in Spring 2009. The benchmarking clearly demonstrated that the SCF machines are inadequate: our tasks take nine times longer to complete on SCF than on our $7,500 CEng mini-cluster.

We presented this data at the Viterbi IT Advisory Council and shared it with the ITS staff and administrators in charge of SCF. (The slides presented at the Viterbi IT Advisory Council are enclosed. Recall that the CEng mini-cluster is referred to as Dhaulgiri.) After more than a year of discussions with ITS that had produced no concrete results, the benchmarking results finally convinced ITS that it needs to make serious changes to SCF. The meetings to plan these changes should take place over the next few weeks. We are hopeful that the new and improved SCF will be in place by the end of September, by the time the new students in VLSI classes receive their first design assignments.

Our ongoing plans for this cluster: First, we will continue to use the CEng mini-cluster for VLSI classes to keep ITS on its toes with respect to SCF upgrades. In Fall 2011, we will give a randomly selected subset of students in participating classes access to this cluster and its tools. (They will retain their access to the SCF machines, like their classmates, but will be encouraged to use this cluster.) We will then survey all students enrolled in participating classes on an ongoing basis to assess the relative benefits of the mini-cluster. We will also evaluate our ability to maintain this cluster. Second, we will use this cluster to install and try out new versions of software and design libraries, to ensure that they work and are stable before asking ITS to install them on SCF. This will minimize unpleasant surprises while allowing us to provide the latest versions of the tools to our students. Third, in August 2011, we will encourage all CEng faculty to use this mini-cluster for any of their classes for which it might be useful. This will also help us continue to evaluate our ability to maintain this cluster.

Other MHI-sponsored activities