At present, China is implementing the "Eastern Data, Western Computing" project. On the topic of computing power development, I would like to share some thoughts on the "data" and "computing" of data centers: "data" here refers to the science of data, and "computing" to the technology of computing power.
To Distinguish: Mainly Computing or Mainly Storage
From 2012 to 2019, the computing power demand of tech giant Google expanded about 300,000-fold in six years, doubling roughly every three and a half months. Why does it need so much computing power? It is artificial intelligence that is driving the growth in demand.
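As a rough consistency check (my own back-of-the-envelope arithmetic, not a figure from the article), a 300,000-fold increase implies

$$\log_2(300{,}000) \approx 18.2\ \text{doublings}, \qquad 18.2 \times 3.5\ \text{months} \approx 64\ \text{months} \approx 5.3\ \text{years},$$

which is broadly in line with the six-year span quoted.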
Take GPT-3, the artificial intelligence language model released by OpenAI in 2020, as an example. It has 175 billion parameters, was trained on 45TB of data, and the resulting model is about 700GB in size. The supercomputer Microsoft built specially for OpenAI provides 285,000 CPU cores and 10,000 GPUs to train OpenAI's models, and a training run costs about $13 million. Clearly, building AI models places high demands on computing power.
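The 700GB figure is consistent with the parameter count if one assumes 32-bit (4-byte) floating-point weights; that assumption is mine, not stated in the article:

$$175 \times 10^{9}\ \text{parameters} \times 4\ \text{bytes/parameter} = 700 \times 10^{9}\ \text{bytes} = 700\ \text{GB}.$$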
At present, computing power can be divided into basic computing power (based on CPU chips), intelligent computing power (based on GPU and NPU chips), and supercomputing power (based on high-performance computers). AI computing centers built on GPU/NPU/FPGA chips are better suited to training on data and producing models. Once a model has been trained, inference, in which input data is fed through the model to obtain AI decisions, does not require nearly as much computing power, so CPU-based general-purpose computing is usually used to run tasks under a known model. In other words, the main job of an intelligent computing center is to compute, while the main job of a data center is to store.
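To make this division of labor concrete, here is a minimal sketch (my illustration, assuming PyTorch and a toy model; it is not taken from the article) of training on a GPU-equipped node and then serving inference on an ordinary CPU node:

```python
# Illustrative only: train where parallel throughput is cheap (GPU),
# then run inference on a plain CPU server. Model and data are toys.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# Training phase: high computing demand, so use a GPU if one is available.
train_device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(train_device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(100):                                   # toy training loop
    x = torch.randn(64, 128, device=train_device)      # fake training batch
    y = torch.randint(0, 10, (64,), device=train_device)
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()

# Inference phase: the trained model is fixed, so a CPU is usually enough.
model.to("cpu").eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 128)).argmax(dim=1)
    print("predicted class:", int(prediction))
```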
According to data from the China Academy of Information and Communications Technology, in 2021 the United States accounted for 31% of global computing power and China for 27%, followed by Japan, Germany, the United Kingdom and other countries. The United States holds 35% of the world's basic computing power, 15% of its intelligent computing power and 30% of its supercomputing power, while China's shares of the three are 27%, 26% and 20% respectively.
It can be seen that the United States leads mainly in basic computing power, while China has overtaken the United States in intelligent computing power. In China, supercomputing and intelligent computing centers are led by the government and basic computing power by telecom operators and Internet companies, whereas in the United States Internet companies dominate.
In addition, China's three major telecom operators have deployed cloud computing capabilities and services, a layout that operators elsewhere in the world do not have and another point of difference from other countries.
To Watch: Cold vs Hot Data
From the data perspective, data can be broadly divided into hot data and cold data. Hot data must be processed in real time, while cold data has no real-time requirement. Of the country's eight national computing power hubs, those in the west are positioned mainly to process cold data plus some local hot data, while those in the east mainly process hot data.
According to the International Data Corporation (IDC), 90% of the data in human history was generated in the past few years, and 50% of it in the past two years. Newly generated data is hot, but after a while hot data "cools down" and becomes cold. By one estimate, cold, warm and hot data account for about 80%, 15% and 5% of the cumulative data volume respectively; in other words, cold data makes up by far the largest share.
What cold data chiefly needs is storage. The computing power hubs in the east and the west are thus better suited to hot data and cold data respectively. In this sense, "Eastern Data, Western Computing" could largely be described as "Eastern Data, Western Storage": storage first and foremost, with computation alongside it.
In terms of computing architecture, there are two main types: storage-compute separation and in-memory computing.
In the storage-compute separation architecture, data is read from storage under the direction of the control unit and sent to the CPU for computation, and the result is then written back to storage. This back-and-forth I/O traffic is inefficient for computing on hot data.
Storage-compute separation does have an advantage, however: a storage unit serves not just a single computing unit but many servers at once, forming pooled storage that can support multi-cloud computing with high utilization, low cost and low energy consumption. This suits cold data well: models can be built on a cloud platform, for example, and trained and simulated with edge computing.
Hot data, by contrast, needs to be processed quickly, which the I/O bottleneck of storage-compute separation limits; CPU performance is further constrained by memory access speed. This calls for in-memory computing, in which the hard disk is replaced by random access memory (RAM) and all operations are performed there. New types of non-volatile memory such as resistive RAM and phase-change memory have achieved breakthroughs in the laboratory, but the cost of deploying them at scale is still relatively high. There are also intermediate modes between in-memory computing and storage-compute separation, such as near-memory computing.
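As a loose software analogy for the difference (my own illustration, not the author's; real in-memory computing is a hardware technique), the sketch below contrasts re-reading data from storage on every pass with keeping it resident in memory. The file name and sizes are hypothetical:

```python
# Illustrative only: I/O on every pass vs. data kept resident in RAM.
import time
import numpy as np

PATH = "cold_data.npy"
np.save(PATH, np.random.rand(2_000_000))      # stand-in for data in a storage pool

def separated(passes=10):
    total = 0.0
    for _ in range(passes):
        data = np.load(PATH)                  # I/O on every pass: the bottleneck
        total += data.sum()                   # the computation itself is cheap
    return total

def in_memory(passes=10):
    data = np.load(PATH)                      # load once, keep the hot data in RAM
    return sum(data.sum() for _ in range(passes))

t0 = time.perf_counter(); separated()
t1 = time.perf_counter(); in_memory()
t2 = time.perf_counter()
print(f"separated: {t1 - t0:.3f}s, in-memory: {t2 - t1:.3f}s")
```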
In general, storage-compute separation suits the processing of cold data, and in-memory computing suits hot data. Autonomous-driving data, for example, needs to be stored and computed at the roadside or even inside the vehicle at the same time.
The west will mainly handle cold data but must also handle some local hot data. Whether hot and cold data call for different storage-compute architectures is likewise a question worth studying.
To Clarify: PUE and IT Energy Efficiency
Data centers today like to emphasize PUE. PUE is the ratio of a data center's total energy consumption to the energy consumption of its IT systems; it mainly reflects the efficiency of the cooling system and says nothing about the energy efficiency of the IT systems themselves.
The indicator that measures carbon usage effectiveness is CUE, which directly reflects how much carbon a data center saves. With conventional electricity, PUE and CUE move together, but with "green" electricity CUE can be low even when PUE is high. A low PUE therefore does not mean low energy consumption, because the IT systems still consume energy.
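Written out with the commonly used definitions (the formulas are my addition for clarity):

$$\mathrm{PUE} = \frac{E_{\text{total facility}}}{E_{\text{IT}}}, \qquad \mathrm{CUE} = \frac{\text{CO}_2\ \text{emitted by the facility}}{E_{\text{IT}}} = \mathrm{CEF} \times \mathrm{PUE},$$

where CEF is the carbon emission factor of the electricity supply (kg CO₂ per kWh). With conventional grid power CEF is roughly fixed, so CUE tracks PUE; with "green" electricity CEF approaches zero, so CUE can remain low even when PUE is high.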
According to statistics, within the IT-system energy consumption of a data center, servers account for about 50%, storage systems for about 35% and network equipment for about 15%. Data centers run 24 hours a day, seven days a week, but running continuously is not the same as computing continuously. In many data centers the share of time actually spent computing is not high, yet data consumes energy even while it "sleeps", and at such times the storage system becomes the main consumer. A McKinsey report found that most of the power in data centers goes to keeping servers running, yet the servers spend most of their time merely holding data, with only 6% to 12% of the power used for computation. Reducing the energy consumption of storage is therefore very important.
To cut energy consumption, cold data storage should be considered first, and some have suggested using tape instead of hard disk. One estimate is that storing 100PB of data entirely on hard disks would cost $16.41 million over 10 years, whereas moving all of it to tape would reduce the storage cost by 73%.
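Applying the quoted 73% saving to the $16.41 million disk figure (simple arithmetic on the numbers above, not an independent estimate):

$$\$16.41\ \text{million} \times (1 - 0.73) \approx \$4.43\ \text{million over 10 years},$$

a saving of roughly $12 million.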
Tape storage is indeed being adopted by more and more technology companies. Baidu's intelligent driving unit, for example, has switched fully to tape storage, cutting overall cost by 85% compared with its previous storage system.
For hot data, by contrast, the faster the better, so flash memory is used instead of disk. Flash is fast and energy-efficient, but its cost is still relatively high.
Another way to improve energy efficiency is data preprocessing. Not all data is useful: invalid values such as blanks, missing entries and outdated records should be removed. Data can also be shrunk with compression algorithms, and its placement and scheduling in storage can be arranged sensibly so that it can be located more quickly, all of which reduces energy consumption.
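As a rough illustration of these preprocessing steps (the column names, thresholds and file names are mine, not the article's), the sketch below strips stray whitespace, drops missing and outdated records, and compresses the result before it is archived:

```python
# Illustrative only: clean a small table, then compress it before archiving.
import gzip
import pandas as pd

df = pd.DataFrame({
    "sensor_id": [" a01 ", "a02", None, "a03"],
    "reading":   [1.2, None, 3.4, 5.6],
    "timestamp": pd.to_datetime(["2021-01-01", "2022-06-01",
                                 "2022-06-02", "2019-01-01"]),
})

df["sensor_id"] = df["sensor_id"].str.strip()     # remove stray spaces
df = df.dropna()                                  # drop rows with missing values
df = df[df["timestamp"] >= "2020-01-01"]          # discard outdated records

# Compress before writing to (cold) storage: a smaller footprint
# means less capacity and less energy spent keeping the data.
raw = df.to_csv(index=False).encode("utf-8")
with gzip.open("cleaned.csv.gz", "wb") as f:
    f.write(raw)
print(f"{len(raw)} raw bytes cleaned and written to cleaned.csv.gz")
```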
Thoughts on "Eastern Data, Western Computing"
"East and West" makes the layout of computing power facilities go beyond the scope of data center hubs. Although it is envisaged that the east and the west are paired with hot and cold data, how should the east and west be matched?
I have noticed that Guangdong Province's data center plan allocates 70% of computing power inside the province and 30% outside it. That does not square with the fact that cold data accounts for about 80% of the total: most data sent out of the province should be cold data, and if cold data is 80% of the whole while out-of-province computing power is only 30%, the capacity clearly cannot meet the need. Or should the 80% figure for cold data be understood as a share of storage capacity rather than of computing power? This is a question.
Moreover, under market-economy conditions the ratio of storage to computing between east and west ought to be matched, but who looks after that match? If each side designs independently, how can capacity be put to best use? In promoting "Eastern Data, Western Computing", coordination between the computing power hubs and data centers of the east and the west therefore needs to be further strengthened.
At the same time, many ratios within a single data center hub or cluster also need to be optimized. A hub contains multiple data centers, and each data center has multiple owners. How should their supplies of energy, land, electricity and so on be coordinated? How can a sharing mechanism be established so that the energy and network resources each data center in the hub requires are pooled and utilization improves? No such mechanism exists yet. To this end, the cross-regional data center capabilities of "Eastern Data, Western Computing" need to be coordinated to avoid a mismatch of storage and computing resources. In short, "Eastern Data, Western Computing" also requires understanding both the data and the computing.
In addition, each data center needs to set a sensible ratio of computing power, storage capacity and network capacity, together with the corresponding disaster-recovery ratio. These depend on the mix of hot and cold data and of large and small files, and cannot be decided with a "one size fits all" rule.
In the long run, the larger a data center, the better its energy efficiency, but construction should not be completed in one step. Generally speaking, CPUs need to be replaced after about a year and a half, so building too far ahead of demand creates waste. Gartner, the information technology research and analysis firm, expects that by 2025 some 75% of data will be processed at the edge, with only 25% sent to cloud computing centers or data hubs. How to coordinate the ratio of computing power between the edge and the central cloud is a proposition that needs careful study. There is therefore still much to study in depth about the "data" and "computing" of data centers. Computing power is still relatively new to us, and we must be good at learning and innovating from practice.
(The author, Wu Hequan, is an academician of the Chinese Academy of Engineering. This article was compiled by Zhao Guangli, a reporter for China Science News, from his speech at the 2022 China Computing Conference.)