One consequence of the continuing growth in HPC consumption is the increasing difficulty that organisations face in finding the highly qualified staff they need to operate and maintain their systems. Recent studies by Gartner, IDC and others have consistently highlighted a serious shortage of qualified candidates, as well as lengthy transition periods for staff moving in from other related HPC disciplines. A lack of knowledge or skilled computing staff also rose to fifth in the IDC top 10 driving factors of HPC deployment.
Traditional HPC management often relies on a dedicated in-house administration team, with responsibility for supporting users, monitoring and troubleshooting faults, handling warranties, and maintaining security and other operations. Day-to-day administration tasks often become simply reactive to changing circumstances, relying on manual processes to identify, diagnose and carry out the required response. Although successful IT services typically aim to boost productivity and reduce operational costs, many of today's cluster administration processes can be labour-intensive and inefficient, costing businesses valuable time and resources each year.
As a specialist in the design, build and operational management of HPC systems, ClusterVision recognises that not all customers have the dedicated resources or skills required to maintain the high levels of cluster management needed by their user communities. For these customers, remote system administration (RSA) provides a professional and secure cluster management service, designed to enhance the overall experience of cluster ownership by either reducing or augmenting in-house administration. Through its RSA services, ClusterVision is able to manage every aspect of its customers' cluster systems, from relatively simple monitoring to full-scale operation, including the set-up and fulfilment of multiple user support environments. This ensures a high-quality operational process and, together with a proactive programme of system health care, helps both to optimise performance and to minimise user downtime.
ClusterVision's RSA service is completely secure and minimally invasive to everyday cluster operation. The standard remote access protocol is Secure Shell (SSH), which is easy to set up and can be quickly disabled when no longer required. Other connection options are available to accommodate non-default remote access types, and most common environments are supported, including IPsec, OpenVPN-compatible clients, Cisco AnyConnect and TeamViewer.
A base level of remote administration (CVS-RSA-BASE) is offered as a standard service for all of ClusterVision's new turnkey cluster implementations. This is used to perform standard 24/7 cluster status checks covering machine operation, interconnect subsystems, storage and application queuing. All relevant notes concerning fixes, changes, updates and upgrades are documented, managed and shared via an online change logbook. Under the latest announced refinements to the RSA structure, base-level customers now also benefit from the automatic inclusion of additional modules for environmental monitoring, management of failover and redundancy systems, network components, and the maintenance and support of parallel file systems.
In addition to providing an underlying framework and foundation for remote administration, the base-level service also acts as a stacking point upon which to add other optional RSA packages. Additional services are arranged in modular work packages which are designed to be simple to engage with, and easy to combine and scale depending on each customer's own specific needs. These currently include administration for common workload management and scheduling systems such as Torque, PBS Professional, SLURM and LSF, as well as layers of system security and backup, ensuring a quality process for audit and disaster recovery. For the user community, RSA can be used to set up and manage multiple user accounts, with additional options for comprehensive first- and second-level end-user support, both directly as a hotline service and via the ClusterVision online Service Portal.
RSA service packages are available initially as a 1-year licence arrangement, with additional content or period extensions available at any time. ClusterVision also offers the option of pre-paid Service Credits as a flexible currency with which customers can further extend their RSA and other service provisions as their circumstances change.
"We are acutely aware of the resource and skills constraints which many of our customers face in the day-to-day operation of their cluster systems, and believe that remote system administration offers a unique solution, either to cover a temporary period of resource shortfall, or as a cost-effective longer-term outsourcing strategy. It is particularly pleasing that several major early adopters of these services are now returning to further expand their RSA environments," stated Christopher Huggins, Commercial Director at ClusterVision.