Introduction
In recent years, cloud services have accounted for a growing share of IT in China's industries. Technology companies have seized the opportunity of the new round of technological revolution: they have actively pursued digital transformation, increased research into and adoption of new technologies such as cloud computing, big data, artificial intelligence, blockchain, and the Internet of Things, and improved their technical service capabilities. As cloud and virtualization technology continues to develop, more and more application systems in data centers are migrating from the original physical environment to cloud platforms, and east-west traffic in data center cloud environments is growing significantly. However, a traditional physical traffic collection network cannot directly collect east-west traffic in the cloud environment, so business traffic in the cloud becomes a blind spot. Extracting east-west traffic in the cloud environment has therefore become an inevitable trend. New east-west traffic collection technology gives application systems deployed in the cloud complete monitoring support as well, and when problems and failures occur, packet capture analysis can be used to diagnose the problem and trace the data flow. Without such collection, three problems arise:
1. Because east-west traffic in the cloud environment cannot be collected directly, application systems in the cloud cannot deploy monitoring based on real-time business data flows, and operations and maintenance personnel cannot discover the real running state of those systems in time. This poses hidden risks to the healthy and stable operation of application systems in the cloud environment.
2. Because east-west traffic in the cloud environment cannot be collected directly, data packets cannot be extracted for analysis when problems occur in business applications in the cloud, which makes fault location more difficult.
3. Network security and audit requirements are becoming increasingly stringent, for example BPC application transaction monitoring, IDS intrusion detection systems, and email and customer service recording audit systems, so the demand for east-west traffic collection in the cloud environment is becoming more and more urgent.
Based on the above analysis, extracting east-west traffic in the cloud environment has become an inevitable trend, and new east-west traffic collection technology is needed so that application systems deployed in the cloud also have complete monitoring support. When problems and failures occur, packet capture analysis can then be used to diagnose the problem and trace the data flow. Extracting and analyzing east-west traffic is a powerful tool for ensuring the stable operation of application systems deployed in the cloud environment.
Key metrics for Virtual Network Traffic Capture
1. Network Traffic Capture Performance
East-west traffic accounts for more than half of data center traffic, so high-performance capture technology is needed to collect all of it. During capture, preprocessing tasks such as deduplication, truncation, and desensitization must also be completed for different services, which further raises the performance requirements.
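The preprocessing steps named above (deduplication, truncation, desensitization) can be illustrated with a minimal Python sketch. It is not a description of any particular product: the function name `preprocess`, the 64-byte truncation length, and the masked byte range are all assumptions chosen for illustration, and a real probe would bound the deduplication set with a time window.

```python
import hashlib
from typing import Iterable, Iterator

SNAP_LEN = 64      # assumed truncation length in bytes
MASK_OFFSET = 32   # hypothetical offset where sensitive bytes start
MASK_LENGTH = 8    # hypothetical number of bytes to zero out


def preprocess(packets: Iterable[bytes]) -> Iterator[bytes]:
    """Deduplicate, truncate, and desensitize raw packets, in that order."""
    seen = set()  # unbounded here; a real probe would expire old digests
    for pkt in packets:
        # Deduplication: skip packets whose full content was already seen.
        digest = hashlib.sha256(pkt).digest()
        if digest in seen:
            continue
        seen.add(digest)

        # Truncation: keep only the first SNAP_LEN bytes.
        masked = bytearray(pkt[:SNAP_LEN])

        # Desensitization: overwrite the assumed sensitive range with zeros.
        end = min(len(masked), MASK_OFFSET + MASK_LENGTH)
        for i in range(MASK_OFFSET, end):
            masked[i] = 0

        yield bytes(masked)
```

Each step adds per-packet work (a hash, a copy, a byte rewrite), which is why preprocessing raises the performance requirement on the capture path.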
2. Resource Overhead
Most east-west traffic collection techniques occupy computing, storage, and network resources that could otherwise be used by business services. Besides consuming as few of these resources as possible, the overhead of managing the capture technology itself must also be considered, especially if the management cost also rises linearly as the number of nodes grows.
3. Level of Intrusion
Common capture technologies today often require additional capture policy configuration on the hypervisor or related components. Besides potentially conflicting with business policies, these policies often add to the load on the hypervisor or other business components and affect the service SLA.
As described above, traffic capture in the cloud environment should focus on capturing east-west traffic between virtual machines and on performance. At the same time, given the dynamic nature of the cloud platform, traffic collection needs to move beyond the traditional switch mirroring model and support flexible, automated deployment of collection and monitoring, in line with the automated operation and maintenance goals of the cloud network. Traffic collection in the cloud environment needs to achieve the following goals:
1) Capture east-west traffic between virtual machines.
2) Deploy capture on the computing nodes and use a distributed collection architecture, avoiding the performance and stability problems caused by switch mirroring.
3) Dynamically sense changes in virtual machine resources in the cloud environment and adjust the collection policy automatically as those resources change.
4) Provide an overload protection mechanism in the capture tool to minimize the impact on the server.
5) Provide traffic optimization in the capture tool itself.
6) Monitor the captured virtual machine traffic from the capture platform.
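As an illustration of the overload protection goal in item 4, the following Python sketch pauses capture when host CPU load is high and resumes it once load has dropped. The class name `OverloadGuard` and both thresholds are hypothetical values chosen for the example. The gap between the two thresholds (hysteresis) is a design choice that keeps the probe from switching on and off rapidly around a single limit.

```python
from dataclasses import dataclass


@dataclass
class OverloadGuard:
    """Decides whether the probe should keep capturing, based on host CPU load."""
    pause_above: float = 0.80   # assumed: pause capture above 80% CPU load
    resume_below: float = 0.60  # assumed: resume once load falls below 60%
    capturing: bool = True

    def update(self, cpu_load: float) -> bool:
        """Feed the latest CPU load (0.0 to 1.0) and return the capture state."""
        if self.capturing and cpu_load > self.pause_above:
            self.capturing = False   # protect business workloads on the host
        elif not self.capturing and cpu_load < self.resume_below:
            self.capturing = True    # load has recovered, resume capture
        return self.capturing
```

A probe would call `update` periodically with a measured load value and skip packet processing while it returns `False`.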
Selection of Virtual Machine Traffic Capturing Mode in Cloud Environment
Virtual machine traffic capture in the cloud environment requires deploying a collection probe on the computing node. Depending on where the collection point is deployed on the computing node, capture can be divided into three modes: Agent Mode, Virtual Machine Mode, and Host Mode.
Virtual Machine Mode: A dedicated capture virtual machine is installed on each physical host in the cloud environment, and a capture soft probe is deployed on it. The host's traffic is mirrored to the capture virtual machine by mirroring virtual network card traffic on the virtual switch; the capture virtual machine then forwards it through a dedicated network card to the traditional physical traffic capture platform, which distributes it to each monitoring and analysis platform. The advantage is that bypass mirroring on the soft switch does not intrude on the existing business network cards or virtual machines, and awareness of virtual machine changes and automatic policy migration can be achieved through certain means. The disadvantages are that the capture virtual machine receives traffic passively, so an overload protection mechanism cannot be implemented, and that the volume of traffic that can be mirrored is limited by the performance of the virtual switch, whose stability is affected to some degree. In a KVM environment, the cloud platform needs to issue the mirroring flow table centrally, which is complex to manage and maintain. In particular, when a host fails, the capture virtual machine behaves like a business virtual machine and migrates to other hosts along with the other virtual machines.
Agent Mode: A capture soft probe (agent) is installed on each virtual machine whose traffic needs to be captured, and the agent software extracts the east-west traffic of the cloud environment and distributes it to each analysis platform. The advantages are that it is independent of the virtualization platform, does not affect virtual switch performance, migrates with the virtual machine, and can filter traffic. The disadvantages are that too many agents need to be managed, and the agent's own influence cannot be ruled out when a fault occurs. The captured traffic also has to be sent out through the existing production network card, which may affect business traffic.
Host Mode: An independent collection soft probe is deployed on each physical host in the cloud environment, runs as a process on the host, and transmits the captured traffic to the traditional physical traffic capture platform. The advantages are a complete bypass mechanism; no intrusion on virtual machines, business network cards, or the virtual switch; a simple capture method; convenient management; no separate virtual machine to maintain; and a lightweight soft probe that can implement overload protection. As a host process, it can monitor host and virtual machine resources and performance to guide the deployment of the mirroring policy. The disadvantages are that it consumes some host resources, so the performance impact needs attention, and that some virtualization platforms may not support deploying capture soft probes on the host.
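Because a host-level probe can see which virtual machines are currently running on its host, it can keep its capture targets in step with virtual machine changes. The Python sketch below shows one way such a reconciliation step might look. The function name `reconcile`, the VM-name-to-interface mapping, and the interface names are assumptions for illustration, not part of any specific platform.

```python
from typing import Dict, Set, Tuple


def reconcile(policy_vms: Set[str],
              running: Dict[str, str],
              captured: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (interfaces to start capturing, interfaces to stop capturing).

    policy_vms: VM names selected by the capture policy
    running:    VM name -> virtual NIC name, for VMs currently on this host
    captured:   virtual NIC names the probe is capturing right now
    """
    # Interfaces that should be captured given the VMs present on this host.
    wanted = {nic for vm, nic in running.items() if vm in policy_vms}
    return wanted - captured, captured - wanted


# Example: "web" has just arrived on this host; the VM behind tap9 has left.
start, stop = reconcile(
    policy_vms={"web", "db"},
    running={"web": "tap1", "cache": "tap2"},
    captured={"tap9"},
)
```

Running this step whenever the set of virtual machines on the host changes lets the collection policy follow migrations without manual reconfiguration.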
Judging from current industry practice, Virtual Machine Mode is used in public clouds, while Agent Mode and Host Mode have some users in private clouds.
Post time: Nov-06-2024