George Amvrosiadis
Associate Research Professor, Affiliated Faculty
CMU Scholars Page
Office 2311 Mehrabian Collaborative Innovation Center
Email gamvrosi@cmu.edu
Department Electrical and Computer Engineering
Computer Science Department: Affiliated
Research Interests Systems, Data-Intensive and Cloud Computing, Distributed Systems, Operating Systems, Scientific Computing
Advisees Nj Mukherjee, Hojin Park, Ziyue Qiu

Biography

I am a faculty member in Electrical & Computer Engineering, affiliated with the Computer Science Department, and a core member of the Parallel Data Lab. My students' research has been added to the Linux kernel, has helped scientists run scientific simulations on one of the largest supercomputers in the world, and has been featured in the Morning Paper, HackerNews, Wired, etc. Accolades we have received include two R&D100 awards, an MLSys outstanding paper award, an IEEE/ACM Supercomputing best paper finalist award, and an ACM SIGMETRICS best paper award. I love hiking, biking, baking, analyzing all types of data, playing video games, and being a general handyperson.

Research/Teaching Statement

My group is currently doing research in the areas of distributed and cloud storage, new storage technologies, high performance computing, and storage for machine learning.

Distributed and Cloud Storage. The cloud can always provide additional resources at a cost. We explore how we can rethink the way we design systems to be mindful of the performance vs cost tradeoff. One example of a traditional design that does not carry over to the cloud is caching: traditional allocation and eviction policies do not take into account that available cache space can be extended (or shrunk) if the cost benefit is worth it.

New Storage Technologies. We explore how new types of storage, such as Zoned Storage and Computational Storage, can be leveraged by different workloads, and we design and build systems that support these types of storage while remaining practical.
High Performance Computing. Scientific simulations, such as those used in weather forecasting and particle physics, challenge the limits of how scalable a single parallel application can be. We design distributed systems that rethink the way metadata is persisted and communication is carried out in order to accommodate exabyte-scale, million-core workloads.

Storage for Machine Learning. We find that the inability to efficiently fetch and preprocess data before training models on it can significantly hurt training performance. We have been designing novel data formats and controlled-fidelity mechanisms to increase the effective bandwidth of the system.

Publications

Conference Mimir: Finding Cost-efficient Storage Configurations in the Public Cloud 2023 • Proceedings of the 16th ACM International Systems and Storage Conference, SYSTOR 2023 • 22-34 Park H, Ganger GR, Amvrosiadis G
Conference RAIZN: Redundant Array of Independent Zoned Namespaces 2023 • International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS • 660-673 Kim T, Jeon J, Arora N, Li H, Kaminsky M, Andersen DG, Ganger GR, Amvrosiadis G, Bjorling M
Preprint Validating Large Language Models with ReLM 2022 Kuchnik M, Smith V, Amvrosiadis G
Conference DeltaFS: A Scalable No-Ground-Truth Filesystem For Massively-Parallel Computing 2021 • International Conference for High Performance Computing, Networking, Storage and Analysis, SC Zheng Q, Cranor CD, Ganger GR, Gibson GA, Amvrosiadis G, Settlemyer BW, Grider GA
Journal Article It's Time to Talk About HPC Storage: Perspectives on the Past and Future 2021 • Computing in Science and Engineering • 23(6):63-68 Settlemyer B, Amvrosiadis G, Carns P, Ross R