Intelligent Computing System Lab

Introduction to research directions

The main research field of the ICSLab is distributed computing, including high-performance computing, energy-efficient computing, and scalable computing. Specifically, our work focuses on a series of intelligent computing systems, including cloud-edge computing systems, big data platforms, distributed deep learning architectures, federated learning, graph computing, heterogeneous memory management, and energy-efficient computing platforms.

Recent achievements

Congratulations to the lab's AI Systems Group on having 3 papers accepted at SC2024, a CCF Class A conference in computer architecture (102 papers accepted in total)!
1. Scaling New Heights: Transformative Cross-GPU Sampling for Training Billion-Edge Graphs
2. MCFuser: High-Performance and Rapid Fusion of Memory-Bound Compute-Intensive Operators
3. Accelerating Distributed DLRM Training with Optimized TT Decomposition and Micro-Batching

(SC: The International Conference for High Performance Computing, Networking, Storage, and Analysis)

Other recent publications:

• Expeditious High-Concurrency MicroVM SnapStart in Persistent Memory with an Augmented Hypervisor. 2024 USENIX Annual Technical Conference (ATC'24).
• Raptor-T: A Fused and Memory-Efficient Sparse Transformer for Long and Variable-Length Sequences. IEEE Transactions on Computers.
• Controlling Aluminum Strip Thickness by Clustered Reinforcement Learning with Real-world Dataset. IEEE Transactions on Industrial Informatics.
• Incendio: Priority-based Scheduling for Alleviating Cold Start in Serverless Computing. IEEE Transactions on Computers.
• MPMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism. IEEE Transactions on Parallel and Distributed Systems.
• A Unified Hybrid Memory System for Scalable Deep Learning and Big Data Applications. Journal of Parallel and Distributed Computing.
• A Survey on Spatio-temporal Big Data Analytics Ecosystem: Resource Management, Processing Platform, and Applications. IEEE Transactions on Big Data.
• An Edge-side Real-time Video Analytics System with Dual Computing Resource Control. IEEE Transactions on Computers.
• TAPU: A Transmission-Analytics Processing Unit for Accelerating Multi-functions in IoT Gateways. IEEE Internet of Things Journal.
• Redundancy-Free High-Performance Dynamic GNN Training with Hierarchical Pipeline Parallelism. International Symposium on High-Performance Parallel and Distributed Computing (ACM HPDC'23).
• MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism. IEEE International Parallel and Distributed Processing Symposium (IPDPS'23).
• DNN Surgery: Accelerating DNN Inference on the Edge through Layer Partitioning. IEEE Transactions on Cloud Computing, 2023.
• Spread: Decentralized Model Aggregation for Scalable Federated Learning. Proceedings of the 51st International Conference on Parallel Processing (ICPP'22).

Collaborating companies

Laboratory members

Professors

Dazhao Cheng

Dazhao Cheng, professor and doctoral advisor, is the vice dean of the School of Computer Science, Wuhan University. His recent research focuses on new in-memory computing architectures for artificial intelligence workloads, and he has a strong research foundation and extensive experience in computer systems and architecture. He has been selected as a Distinguished Professor under Hubei Province's "100 Talents Plan" and as an innovative talent of Wuhan. He is a standing committee member of the CCF Technical Committee on Distributed Computing and Systems, and serves as chief scientist of key projects under the National Key R&D Program and the National Natural Science Foundation of China. From 2016 to 2020, he was a tenure-track Assistant Professor in the Department of Computer Science at the University of North Carolina. He has served as a review expert for the Ministry of Science and Technology, for key projects of the Guangdong and Hubei Provincial Departments of Science and Technology, and for the national science foundations of the United States and Canada. He has published 50 papers in authoritative journals and conferences in the field of computer systems and has applied for more than 10 patents; more than 30 of these papers are first- or corresponding-author publications, including more than 20 CCF Class A papers. His research centers on compute-in-memory architectures, distributed computing, and related topics; he has led one National Key R&D Program project, one key project of the National Natural Science Foundation of China, two U.S. NSF projects, and several other provincial and ministerial projects, with total funding of about 17 million yuan. He has also been invited to serve in a number of international and domestic academic organizations, acting as guest editor for 4 academic journals, chair of 5 international conferences, and technical program committee member for 28 international conferences.
• E-mail: dcheng@whu.edu.cn
• Google Scholar, CV

Yili Gong

Yili Gong, associate professor and master's supervisor, graduated from the Department of Computer Science at Wuhan University in 1998 and received a Ph.D. in computer architecture from the Institute of Computing Technology, Chinese Academy of Sciences, in February 2006. From 2006 to 2007, he worked in the Community Grids Lab of Professor Geoffrey Fox at Indiana University, and he has worked in the School of Computer Science at Wuhan University since 2008. From 2014 to 2015, he spent a year as a visiting scholar at the University of Michigan, Ann Arbor, working with Sugih Jamin. He is broadly interested in distributed systems; his current work focuses on intelligent operations and maintenance in HPC environments, distributed file systems, and blockchain systems. Representative works (first/corresponding author): The Computer Journal 2016, HPCC 2015, ASAP 2017, ICPADS 2016, ICPP 2019, etc.
• E-mail: yiligong@whu.edu.cn

Chuang Hu

Chuang Hu, associate researcher and master's supervisor, received his Ph.D. from the Department of Computing at The Hong Kong Polytechnic University in 2019 and then served there as a postdoctoral fellow and research assistant professor. His main research directions are edge learning, the Internet of Things, and federated analytics. He has led a project on an adaptive accelerator-side acceleration platform supporting real-time video analytics, with funding of about 420,000 yuan. Representative works (first author): IEEE Network 2021, ACM TOSN 2018, INFOCOM 2018 and 2019.
• E-mail: hchuchuang@gmail.com

PhD Students

Xiaoming Han

Research direction:
Deep learning framework optimization, parallel computing, distributed computing
• E-mail: wsdhrchen@163.com

Huanghuang Liang

Research direction:
Big data platforms, distributed computing systems.
• E-mail: hhliang@whu.edu.cn

Xinquan Cai

Research direction:
Task scheduling on serverless platforms; cold and warm start optimization.
• E-mail: xinquancai@whu.edu.cn

Yaqi Xia

Research direction:
Graph neural networks, 3D point clouds
• E-mail: yaqixia@whu.edu.cn

Zhili He

Research direction:
Deep learning
• E-mail: hezhili@alcswa.com

Qianlong Sang

Research direction:
Linux kernel, cloud-edge system scheduling
• E-mail: 2017301110036@whu.edu.cn
• Personal homepage

Haoran Zhou

Research direction:
Deep learning, distributed computing, memory management
• E-mail: zhrzhr@whu.edu.cn

Master's Students

Boan Liu

Research direction:
Deep learning framework optimization, parallel computing, distributed computing
• E-mail: boanliu@whu.edu.cn

Hulin Wang

Research direction:
GPU virtualization, cloud-edge computing
• E-mail: 861617573@qq.com

Zheng Zhang

Research direction:
Acceleration of distributed deep learning training; distributed training frameworks.
• E-mail: 303187311@qq.com

Yuqing Zhang

Research direction:
Graph computing
• E-mail: 2018302110091@whu.edu.cn

Yan Gong

Research direction:
Federated Learning, Game Theory
• E-mail: 1046957180@qq.com

Tianyu Tu

Research direction:
Federated learning, federated analytics
• E-mail: meetnailtu30@gmail.com

Nanxi Wu

Research direction:
Federated learning
• E-mail: nancywu@whu.edu.cn

Xiaoyue Yang

Research direction:

• E-mail: 1768389167@qq.com

Hanqi Feng

Research direction:

• E-mail: 528849400@qq.com

Hengjie Cai

Research direction:

• E-mail: 2017312580122@whu.edu.cn

Ziqi Xiao

Research direction:
Industrial Big Data Architecture
• E-mail: ziqixiao@whu.edu.cn

Graduates

Sen Wei

Master's graduate


Undergraduates

Runpeng Geng

Research direction:
CUDA programming
• E-mail: kevingeng@whu.edu.cn

Jinrong Yang

Research direction:
Linux architecture, virtualization technology
• E-mail: coyangjr@whu.edu.cn

Research directions

Ⅰ. High-performance computing

ⅰ. High-performance cloud computing
Cloud computing divides a single problem into multiple parts, each solved by a different computer; as long as the computers are connected to the network, they can exchange large volumes of data to solve the problem cooperatively. Building on this model, we propose and develop a new in-memory distributed computing framework that aims to improve resource utilization in multi-user big data clusters. Related results were published in IEEE TC'17, IEEE TPDS'18, and IEEE INFOCOM'17.
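As a toy illustration of this scatter-gather pattern (not the lab's framework; the workload and worker count are invented for the example), the Python sketch below splits one large computation across a pool of workers and merges the partial results:

```python
# Toy scatter-gather: split one problem into parts, solve each part on a
# different worker, then merge the partial results. Illustrates the general
# pattern only, not the lab's in-memory distributed framework.
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    # Each worker solves one part of the problem independently.
    return sum(x * x for x in chunk)

def distributed_sum_of_squares(data, workers=4):
    # Scatter: cut the input into one chunk per worker.
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Gather: merge the partial results into the final answer.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(distributed_sum_of_squares(list(range(1_000_000))))
```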

ⅱ. High-performance edge computing
Edge computing processes data at the edge of the network, which reduces request response time, extends battery life, saves network bandwidth, and helps protect data security and privacy. To support transfer learning efficiently at the edge, we designed a multi-task transfer learning optimization system that allocates resources according to task importance. Related results were published in IEEE Network'21, IEEE TPDS'20, and ICDCS'22.
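The task-importance idea can be illustrated with a budgeted allocation loop: given limited edge capacity, serve the highest-importance tasks first. The sketch below is a hypothetical simplification, not the published data-driven allocation algorithm:

```python
# Sketch of importance-driven task allocation at the edge: with a limited
# training budget, admit the highest-importance tasks first. A hypothetical
# simplification, not the lab's data-driven allocation method.
def allocate(tasks, budget):
    """tasks: {name: (importance, cost)}; budget: total edge capacity."""
    chosen = []
    for name, (importance, cost) in sorted(
            tasks.items(), key=lambda kv: kv[1][0], reverse=True):
        if cost <= budget:          # admit the task if it still fits
            chosen.append(name)
            budget -= cost
    return chosen

tasks = {"detect": (0.9, 3), "track": (0.7, 2), "caption": (0.2, 4)}
print(allocate(tasks, budget=5))    # -> ['detect', 'track']
```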

ⅲ. Big data platform
With the advent of the big data era, data volumes keep growing. The traditional single-machine model is so difficult and expensive to scale that it can hardly support business growth, which makes optimization of the big data platform particularly important. We propose RDS, a Hadoop scheduler based on dynamic resource awareness, to address the shortcomings of existing schedulers in dynamic Hadoop big data clusters. Related results were published in IEEE IPDPS'15, IEEE IPDPS'18, IEEE ICDCS'15, IEEE TPDS'17, and IEEE TPDS'18.
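The core of dynamic resource awareness can be sketched in a few lines: before each placement, the scheduler consults the resources currently free on every node instead of relying on static slot counts. The code below is an illustrative simplification, not the RDS scheduler itself:

```python
# Minimal sketch of dynamic resource-aware task placement: pick the node
# whose currently free capacity best fits the task's demand. Illustrative
# only; the published RDS scheduler is more sophisticated.
def pick_node(nodes, task):
    """nodes: {name: {"free_cpu": c, "free_mem": m}}; task: {"cpu": c, "mem": m}."""
    feasible = {n: r for n, r in nodes.items()
                if r["free_cpu"] >= task["cpu"] and r["free_mem"] >= task["mem"]}
    if not feasible:
        return None  # queue the task until resources are released
    # Best fit: leave the least slack so large tasks can still be placed later.
    return min(feasible, key=lambda n: feasible[n]["free_cpu"] - task["cpu"])

nodes = {"n1": {"free_cpu": 8, "free_mem": 32}, "n2": {"free_cpu": 2, "free_mem": 16}}
print(pick_node(nodes, {"cpu": 2, "mem": 8}))  # -> "n2" (tightest fit)
```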


Ⅱ. Energy-efficient computing

ⅰ. Energy-efficient edge computing
The environment at the user edge is inevitably resource-constrained, so energy-efficient design is critical to the performance of the whole system; without it, the system may not run at all. Our student Qianlong Sang is researching dynamic voltage and frequency scaling (DVFS) on mobile devices in order to realize energy-efficient design on the mobile side.
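On Linux-based mobile platforms, this kind of frequency scaling is typically driven through the standard cpufreq sysfs interface. The sketch below shows only the mechanism; available governors and write permissions vary by device, it must run as root, and the idle-threshold policy is a made-up example:

```python
# Hedged sketch of DVFS control via the standard Linux cpufreq sysfs files.
# Requires root; available governors and frequencies are device-specific.
CPUFREQ = "/sys/devices/system/cpu/cpu{cpu}/cpufreq/{knob}"

def read_knob(cpu, knob):
    with open(CPUFREQ.format(cpu=cpu, knob=knob)) as f:
        return f.read().strip()

def write_knob(cpu, knob, value):
    with open(CPUFREQ.format(cpu=cpu, knob=knob), "w") as f:
        f.write(str(value))

# Example policy (invented for illustration): switch to the powersave
# governor when utilization is low, back to performance otherwise.
def scale_down_if_idle(cpu, utilization, threshold=0.2):
    if utilization < threshold:
        write_knob(cpu, "scaling_governor", "powersave")
    else:
        write_knob(cpu, "scaling_governor", "performance")

print(read_knob(0, "scaling_cur_freq"))  # current frequency in kHz
```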

ⅱ. Energy-efficient cloud computing
The design of energy-efficient big data clusters has attracted extensive attention in recent years. We propose E-Ant, a heterogeneity-aware task assignment method that aims to minimize the total energy consumption of heterogeneous Hadoop clusters. E-Ant adaptively schedules heterogeneous workloads without requiring their attributes to be known in advance. Related results were published in IEEE ICDCS'15, IEEE TPDS'18, IEEE MASCOTS'13, and ACM TAAS'15.
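The adaptive principle can be illustrated with a simple feedback loop: each (workload type, machine) pair keeps a score that is reinforced whenever a run turns out to be energy-efficient, and future assignments sample machines in proportion to those scores. This sketch only conveys the feedback idea and is not the published E-Ant algorithm:

```python
import random

# Sketch of energy-feedback task assignment: machines that ran a workload
# type efficiently become more likely to receive it again. Illustrative
# only; the published E-Ant algorithm differs in its details.
class AdaptiveAssigner:
    def __init__(self, machines):
        self.machines = machines
        self.score = {}  # one score ("pheromone") per (workload type, machine)

    def assign(self, wtype):
        weights = [self.score.get((wtype, m), 1.0) for m in self.machines]
        return random.choices(self.machines, weights=weights)[0]

    def feedback(self, wtype, machine, joules):
        # Decay all scores slightly, then reinforce the efficient placement.
        for m in self.machines:
            key = (wtype, m)
            self.score[key] = 0.9 * self.score.get(key, 1.0)
        self.score[(wtype, machine)] += 1.0 / max(joules, 1e-9)

assigner = AdaptiveAssigner(["atom-node", "xeon-node"])
m = assigner.assign("map-task")
assigner.feedback("map-task", m, joules=120.0)  # measured energy for the run
```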

ⅲ. Green data center
In recent years, with the rapid development of the cloud computing industry, building green data centers to save energy and reduce emissions has become a hot topic. We propose an energy-aware elastic resource allocation scheme for green data centers. Related results were published in IEEE TC'16 and IEEE TPDS'18.


Ⅲ. Scalable computing

ⅰ. Cloud-edge collaborative computing
Edge computing and cloud computing are complementary and collaborative rather than substitutes for each other: they must be closely coordinated to match diverse demand scenarios, which amplifies the value of both. We propose an edge-cloud video analytics architecture that significantly reduces video analysis latency. Related results were published in IEEE Network'21 and IEEE TPDS'20.
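A common way to realize such an edge-cloud split is a two-tier filter: a lightweight edge model answers the easy frames locally, and only low-confidence frames pay the WAN round trip to a heavier cloud model. In the sketch below, `edge_model` and `cloud_model` are hypothetical stand-ins, not components of our system:

```python
# Two-tier edge-cloud analytics sketch: answer locally when the cheap edge
# model is confident, otherwise offload the frame to the cloud model.
# `edge_model` and `cloud_model` are hypothetical stand-ins.
def analyze(frame, edge_model, cloud_model, conf_threshold=0.8):
    label, confidence = edge_model(frame)   # fast, runs on the edge box
    if confidence >= conf_threshold:
        return label                        # no WAN latency paid
    return cloud_model(frame)               # heavy model, higher latency

# Usage with trivial stand-in models:
fast = lambda f: ("person", 0.6)
slow = lambda f: "person-with-bicycle"
print(analyze("frame-0", fast, slow))       # falls back to the cloud
```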

ⅱ. Distributed AI
Improving prediction quality and making machine learning feasible for more complex applications requires large amounts of training data, so machine learning workloads must be distributed across multiple machines. We designed a scheduler for distributed GPU clusters that efficiently schedules and places deep learning tasks to reduce their completion time. Related results were published in ACM/IFIP Middleware'20, IEEE Big Data'19, ACM PPoPP'20, and IEEE Cluster'20.
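One placement heuristic commonly used in such schedulers is consolidation: pack a job's workers onto as few nodes as possible to cut cross-node communication. The sketch below illustrates that generic heuristic, not our scheduler's actual policy:

```python
# Sketch of a consolidation heuristic often used when placing distributed
# deep learning jobs: span as few nodes as possible to reduce cross-node
# communication. Illustrative only; not the lab's scheduler.
def place_job(free_gpus, demand):
    """free_gpus: {node: count}; demand: total GPUs the job needs."""
    placement = {}
    # Prefer nodes with the most free GPUs so the job spans fewer nodes.
    for node in sorted(free_gpus, key=free_gpus.get, reverse=True):
        if demand == 0:
            break
        take = min(free_gpus[node], demand)
        if take:
            placement[node] = take
            demand -= take
    return placement if demand == 0 else None  # None: not enough GPUs now

print(place_job({"a": 4, "b": 2, "c": 8}, demand=10))  # {'c': 8, 'a': 2}
```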

ⅲ. Federated Learning/Analysis
Federated analytics is a distributed computing paradigm proposed by Google: participants perform analysis tasks together without disclosing the local data of edge devices. We designed a distributed learning framework for federated anomaly analytics that actively defends against local model poisoning attacks. Related results were published in ICDCS'19 and JSAC'22.
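A standard building block for this kind of defense is robust aggregation: replacing the coordinate-wise mean with a median so that a minority of poisoned updates cannot drag the global model arbitrarily far. The sketch below shows this generic defense, not the lab's federated anomaly analytics method:

```python
import numpy as np

# Generic robust-aggregation sketch: the coordinate-wise median tolerates a
# minority of poisoned client updates, unlike the plain mean. A standard
# defense, not the lab's federated anomaly analytics method.
def robust_aggregate(client_updates):
    # client_updates: list of 1-D parameter vectors, one per client.
    return np.median(np.stack(client_updates), axis=0)

honest = [np.array([1.0, 1.0]), np.array([1.1, 0.9]), np.array([0.9, 1.1])]
poisoned = [np.array([100.0, -100.0])]
print(robust_aggregate(honest + poisoned))  # stays near [1, 1]
```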

A unique feature of federated learning is that edge devices belong to individuals. With a large number of edge devices, centralized model aggregation becomes a bottleneck that fundamentally limits system scalability. We designed Spread, a scalable federated learning system that builds clusters with an adaptive algorithm and adjusts inter-cluster and intra-cluster model training at runtime. Related results were published in ICPP'22.
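The scalability benefit of clustering can be seen in a two-level averaging sketch: clients aggregate within their cluster first, and only the much smaller set of cluster summaries is combined globally. This minimal sketch assumes simple unweighted averaging; Spread's adaptive cluster construction is considerably more involved:

```python
import numpy as np

# Two-level aggregation sketch: average within each cluster first, then
# across clusters, so no single node ever touches every client update.
# Assumes unweighted averaging; Spread's actual algorithm is more involved.
def hierarchical_aggregate(clusters):
    # clusters: list of clusters; each cluster is a list of client vectors.
    cluster_means = [np.mean(np.stack(c), axis=0) for c in clusters]
    return np.mean(np.stack(cluster_means), axis=0)

c1 = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
c2 = [np.array([5.0, 6.0])]
print(hierarchical_aggregate([c1, c2]))  # [3.5, 4.5]
```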

Scientific research achievements

Journal Papers

[1] X. Wei, A. B. M. M. Rahman, D. Cheng, Y. Wang: Joint Optimization across Timescales: Resource Placement and Task Dispatching in Edge Clouds. IEEE Transactions on Cloud Computing (TCC), 2021.

[2] T. Li, Z. Qiu, D. Cheng, W. Wang, X. Shi, Y. Wang*: Privacy-Preserving Participant Grouping for Mobile Social Sensing over Edge Clouds. IEEE Transactions on Network Science and Engineering (TNSE), 2021.

[3] D. Cheng, Y. Wang, D. Dai*: Dynamic Resource Provisioning for Iterative Workloads on Apache Spark. IEEE Transactions on Cloud Computing (TCC), 2021.

[4] W. Rang, D. Yang, D. Cheng*: Dependency-aware Tensor Scheduler for Industrial AI Applications. IEEE Industrial Electronics Magazine (IEM), 2021.

[5] W. Rang, D. Yang, D. Cheng*, Y. Wang: Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2021.

[6] D. Yang, D. Cheng*, W. Rang, Y. Wang: Joint Optimization of MapReduce Scheduling and Network Policy in Hierarchical Data Centers. IEEE Transactions on Cloud Computing (TCC), 2019.

[7] D. Cheng*, X. Zhou, Y. Xu, L. Liu, C. Jiang: Deadline-aware MapReduce Job Scheduling with Dynamic Resource Availability. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2018, 30(4): 814-826.

[8] D. Cheng*, X. Zhou, Z. Ding, Y. Wang, M. Ji: Heterogeneity Aware Workload Management in Distributed Sustainable Datacenters. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2018, 30(2): 375-387.

[9] D. Cheng*, X. Zhou, Y. Wang, C. Jiang: Adaptive Scheduling of Parallel Jobs with Dynamic Batching in Spark Streaming. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2018, 29(12): 2672-2685.

[10] D. Cheng*, X. Zhou, P. Lama, M. Ji, C. Jiang: Energy Efficiency Aware Task Assignment with DVFS in Heterogeneous Hadoop Clusters. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2017, 29(1): 70-82.

[11] D. Cheng, X. Zhou*, P. Lama, J. Wu, C. Jiang: Cross-Platform Resource Scheduling for Spark and MapReduce on YARN. IEEE Transactions on Computers (TC), 2017, 66(8): 1341-1353.

[12] D. Cheng, J. Rao, Y. Guo, C. Jiang, X. Zhou*: Improving Performance of Heterogeneous MapReduce Clusters with Adaptive Task Tuning. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016, 28(3): 774-786.

[13] Y. Guo, J. Rao, D. Cheng, X. Zhou*: iShuffle: Improving Hadoop Performance with Shuffle-on-Write. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016, 28(6): 1649-1662.

[14] D. Cheng, J. Rao, C. Jiang, X. Zhou*: Elastic Power-aware Resource Provisioning of Heterogeneous Workloads in Self-Sustainable Datacenters. IEEE Transactions on Computers (TC), 2015, 65(2): 508-521.

[15] D. Cheng, Y. Guo, C. Jiang, X. Zhou*: Self-Tuning Batching with DVFS for Performance Improvement and Energy Efficiency in Internet Servers. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2015, 10(1): 1-32.

[16] Y. Gong, C. Hu, Y. Xu, W. Wang: A Distributed File System with Variable Sized Objects for Enhanced Random Writes. The Computer Journal, 2016, 59(10): 1536-1550.

[17] C. Hu, R. Lu, D. Wang: FEVA: A FEderated Video Analytics Architecture for Networked Smart Cameras. IEEE Network, 2021, 35(6): 163-170.

[18] S. Shi, C. Hu, D. Wang, Y. Zhu, Z. Han: Federated Anomaly Analytics for Local Model Poisoning Attack. IEEE Journal on Selected Areas in Communications (JSAC), 2022, 40(2): 596-610.

[19] C. Hu, W. Bao, D. Wang, Y. Qian, M. Zheng, S. Wang: sTube+: An IoT Communication Sharing Architecture for Smart After-sales Maintenance in Buildings. ACM Transactions on Sensor Networks (TOSN), 2018, 14(3-4): 1-29.

[20] Q. Chen, Z. Zheng, C. Hu, D. Wang, F. Liu: On-Edge Multi-Task Transfer Learning: Model and Practice with Data-Driven Task Allocation. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2020, 31(6): 1357-1371.

[21] D. Wang, W. Bao, C. Hu, Y. Qian, M. Zheng, S. Wang: sTube: An Architecture for IoT Communication Sharing. IEEE Wireless Communications Magazine, 2018, 56(7): 96-101.



Conference Papers

[1] W. Rang, D. Yang, Z. Li, D. Cheng: Scalable Data Management on Hybrid Memory System for Deep Neural Network Applications. 2021 IEEE International Conference on Big Data (Big Data), 1470-1480.

[2] K. Suo, J. Son, D. Cheng, W. Chen, S. Baidya: Tackling Cold Start of Serverless Applications by Efficient and Adaptive Container Runtime Reusing. 2021 IEEE International Conference on Cluster Computing (CLUSTER), 433-443.

[3] W. Rang, D. Yang, D. Cheng: A Shared Memory Cache Layer across Multiple Executors in Apache Spark. 2020 IEEE International Conference on Big Data (Big Data), 477-482.

[4] D. Yang, W. Rang, D. Cheng: Mitigating Stragglers in the Decentralized Training on Heterogeneous Clusters. Proceedings of the 21st International Middleware Conference (Middleware), 386-399.

[5] W. Rang, D. Yang, D. Cheng*, K. Suo, W. Chen: Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning. 2020 IEEE International Conference on Cluster Computing (CLUSTER), 392-398.

[6] K. Suo, Y. Shi, X. Xu, D. Cheng, W. Chen: Tackling Cold Start in Serverless Computing with Container Runtime Reusing. Proceedings of the Workshop on Network Application Integration/CoDesign, 54-55.

[7] D. Yang, D. Cheng: Efficient GPU Memory Management for Nonlinear DNNs. Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 185-196.

[8] J. Tian, S. Di, C. Zhang, X. Liang, S. Jin, D. Cheng, D. Tao*, F. Cappello: WaveSZ: A Hardware-Algorithm Co-Design of Efficient Lossy Compression for Scientific Data. Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 74-88.

[9] D. Yang, W. Rang, D. Cheng, Y. Wang, J. Tian, D. Tao: Elastic Executor Provisioning for Iterative Workloads on Apache Spark. 2019 IEEE International Conference on Big Data (Big Data), 413-422.

[10] T. B. G. Perez, X. Zhou, D. Cheng: Reference-Distance Eviction and Prefetching for Cache Management in Spark. Proceedings of the 47th International Conference on Parallel Processing (ICPP), 1-10.

[11] D. Yang, W. Rang, D. Cheng: Joint Optimization of MapReduce Scheduling and Network Policy in Hierarchical Clouds. Proceedings of the 47th International Conference on Parallel Processing (ICPP), 1-10.

[12] P. Lama, S. Wang, X. Zhou, D. Cheng: Performance Isolation of Data-Intensive Scale-Out Applications in a Multi-Tenant Cloud. 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 85-94.

[13] D. Cheng, Y. Chen, X. Zhou, D. Gmach, D. Milojicic: Adaptive Scheduling of Parallel Jobs in Spark Streaming. IEEE INFOCOM 2017 - IEEE Conference on Computer Communications (INFOCOM), 1-9.

[14] D. Cheng, P. Lama, C. Jiang, X. Zhou: Towards Energy Efficiency in Heterogeneous Hadoop Clusters by Adaptive Task Assignment. 2015 IEEE 35th International Conference on Distributed Computing Systems (ICDCS), 359-368.

[15] D. Cheng, J. Rao, C. Jiang, X. Zhou: Resource and Deadline-aware Job Scheduling in Dynamic Hadoop Clusters. 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 956-965.

[16] Y. Guo, J. Rao, D. Cheng, C. Jiang, C.-Z. Xu, X. Zhou: StoreApp: A Shared Storage Appliance for Efficient and Scalable Virtualized Hadoop Clusters. 2015 IEEE Conference on Computer Communications (INFOCOM), 594-602.

[17] D. Cheng, J. Rao, Y. Guo, X. Zhou: Improving MapReduce Performance in Heterogeneous Environments with Adaptive Task Tuning. Proceedings of the 15th International Middleware Conference (Middleware), 97-108.

[18] D. Cheng, C. Jiang, X. Zhou: Heterogeneity-aware Workload Placement and Migration in Distributed Sustainable Datacenters. 2014 IEEE 28th International Parallel and Distributed Processing Symposium (IPDPS), 307-316.

[19] D. Cheng, Y. Guo, X. Zhou: Self-Tuning Batching with DVFS for Improving Performance and Energy Efficiency in Servers. 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), 40-49.

[20] Y. Gong, Y. Xu, Y. Lei, W. Wang: VarFS: A Variable-Sized Objects Based Distributed File System. 2015 IEEE 17th International Conference on High Performance Computing and Communications (HPCC), New York, NY, USA.

[21] Y. Gong, J. Tang, W. Li, Z. Ye: Massive Spatial Query on the Kepler Architecture. 2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP), Seattle, WA, USA.

[22] Y. Gong, C. Hu, W. Ma, W. Wang: CC-Paxos: Integrating Consistency and Reliability in Wide-Area Storage Systems. 22nd IEEE International Conference on Parallel and Distributed Systems (ICPADS 2016), Wuhan, China, December 2016. [CCF-C]

[23] C. Gong, S. He, Y. Gong, Y. Lei: On Integration of Appends and Merges in Log-Structured Merge Trees. Proceedings of the 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan.

[24] C. Hu, W. Bao, D. Wang, F. Liu: Dynamic Adaptive DNN Surgery for Inference Acceleration on the Edge. Proceedings of the 38th IEEE International Conference on Computer Communications (INFOCOM'19), Paris, France, April 29 - May 2, 2019. [CCF-A]

[25] C. Hu, W. Bao, D. Wang: IoT Communication Sharing: Scenarios, Algorithms and Implementation. Proceedings of the 37th IEEE International Conference on Computer Communications (INFOCOM'18), Honolulu, HI, USA, October 24-28, 2018. [CCF-A]

[26] C. Hu, H. Liang, X. Han, B. Liu, D. Cheng, D. Wang: Spread: Decentralized Model Aggregation for Scalable Federated Learning. 51st International Conference on Parallel Processing (ICPP'22), Bordeaux, France, August 29 - September 1, 2022. [CCF-B]

[27] R. Lu, C. Hu*, D. Wang, J. Zhang: Gemini: A Real-time Video Analytics System with Dual Computing Resource Control. The Seventh ACM/IEEE Symposium on Edge Computing (SEC'22), Seattle, WA, December 5-8, 2022.

[28] S. Shi, C. Hu, D. Wang, Y. Zhu, Z. Han: Distributionally Robust Federated Learning for Differentially Private Data. 42nd IEEE International Conference on Distributed Computing Systems (ICDCS'22), Bologna, Italy, July 10-13, 2022. [CCF-B]

[29] C. Hu, W. Bao, D. Wang, Y. Qian, M. Zheng, S. Wang: sTube+: An IoT Communication Sharing Architecture for Smart After-sales Maintenance in Buildings. Proceedings of the 4th ACM International Conference on Systems for Energy-Efficient Built Environments (BuildSys'17), Delft, The Netherlands, November 8-9, 2017.

[30] J. Peng, Q. Li, X. Ma, Y. Jiang, Y. Dong, C. Hu, M. Chen: MagNet: Cooperative Edge Caching by Automatic Content Congregating. Proceedings of the ACM Web Conference (WWW'22), 2022. [CCF-A]

[31] Q. Chen, Z. Zheng, C. Hu, D. Wang, F. Liu: Data-driven Task Allocation for Multi-task Transfer Learning on the Edge. Proceedings of the IEEE International Conference on Distributed Computing Systems (ICDCS'19), Dallas, Texas, July 7-9, 2019. [CCF-B]

[32] Z. Zheng, C. Hu, D. Wang: Time-aware Chiller Sequencing Control with Data-driven Chiller Performance Profiling (Poster). Proceedings of the 4th ACM International Conference on Systems for Energy-Efficient Built Environments (BuildSys'17), Delft, The Netherlands, November 8-9, 2017.

Experimental platform

Cluster servers

• Our laboratory is equipped with 16 cluster servers and 1 memory server, connected by a gigabit network. They are mainly shared by laboratory students for individual and cluster-level experiments.

  • Server *16
  • CPU Intel Core i9-10900X *1
  • Memory Kingston 16GB *4
  • GPU NVIDIA GeForce RTX 3080 10GB *1
  • Memory Server *1
  • CPU Intel Xeon Gold 6226R *1
  • Memory 32GB DDR4-2933 *4 + Intel Optane Persistent Memory 100 Series 128GB *4

Blade Servers

• Our laboratory also hosts several blade servers in the machine room of the School of Computer Science, mainly for virtualization experiments and deep learning computation. The GPU cards in the A100 server are interconnected via NVLink.

  • A100 Server *1
  • CPU Intel Xeon Gold 6240C *2
  • Memory DDR4 32GB *8
  • GPU NVIDIA A100 40GB *4

Supercomputing center

• In addition to the resources provided by the laboratory, members can also use the resources of the Supercomputing Center of Wuhan University for experimental research.

  • Main performance indicators of the supercomputing center
  • CPU cluster: 10,176 CPU cores, peak performance 350 TFLOPS
  • KNL cluster: 11,424 CPU cores, peak performance 500 TFLOPS, 100G OPA interconnect
  • GPU cluster: 500 NVIDIA Tesla V100 16GB GPUs, peak performance 3,750 TFLOPS, 100G OPA interconnect
  • Storage: 30 I/O nodes, Lustre parallel file system, 3 PB
  • Compute network: 56 Gbps FDR InfiniBand at full line rate, 100G OPA interconnect