Machine Learning Infrastructure from a vSphere Infrastructure Perspective

For the last 18 months, I’ve been focusing on machine learning, especially how customers can successfully deploy machine learning infrastructure on vSphere. This space is exciting because it has so many great angles to explore. Besides model training, a lot happens with the data. Data is … [Read more...]

Deep Learning Technology Stack Overview for the vAdmin – Part 1

Introduction: We are amid the AI “gold rush.” More organizations are looking to incorporate some form of machine learning (ML) or deep learning into their services to enhance customer experience, drive efficiencies in their processes, or improve quality of life (healthcare, transportation, smart … [Read more...]

Multi-GPU and Distributed Deep Learning

More enterprises are incorporating machine learning (ML) into their operations, products, and services. As with other workloads, a hybrid-cloud strategy is used for ML development and deployment. A common strategy is using the excellent toolset and training data offered by public cloud ML services for generic … [Read more...]

Machine Learning Workload and GPGPU NUMA Node Locality

In the previous article, “PCIe Device NUMA Node Locality,” I covered the physical connection between the processor and the PCIe device and briefly touched upon machine learning workloads with regard to PCIe NUMA locality. This article zooms in on why it is important to consider PCIe NUMA … [Read more...]