Problem: As we discussed yesterday, we'd like you to prepare for some specific technical discussion in the next interview. It's not unusual for engineers here to need to come up to speed on unfamiliar code bases as an integral part of investigating a customer issue.
With that in mind, since the role you're applying for has a heavy OpenStack component, please prepare to discuss aspects of the Neutron Open vSwitch agent.
We would like you to study and be able to describe the code path in the OVSNeutronAgent.rpc_loop method in neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py.
We will discuss the overall flow of the code path executed. For this exercise, you can assume that the version of the Neutron code is the Rocky release of OpenStack, available through:
git clone -b stable/rocky https://github.com/openstack/neutron
This won't be the only topic for the interview; we're using this as an opportunity to assess your ability to come up to speed with unfamiliar code.
As this will be a deep technical screen, please feel free during the next interview to explore the questions we ask in depth, as long as you feel the additional details you provide contribute to giving a technically precise answer.
Things to understand:
- Neutron component of OpenStack
- Open vSwitch driver
- Plugin implementation architecture
- Code in neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py around the rpc_loop method.
- What is Neutron and its architecture?
- Layer 2 and Layer 3 networking concepts.
- https://docs.openstack.org/arch-design/design-networking/design-networking-concepts.html
- For dynamic machines (moving around the data center), layer 2 is efficient: no overhead of IP-based routing.
- Troubleshooting is hard.
- Layer 3 has the same scalability as the internet; it is easy to manage traffic, QoS, etc.
- No built-in isolation mechanism like VLAN, so moving a machine out of a subnet is difficult and requires IP encapsulation software.
- Ethernet frames contain all the essentials for networking. These include, but are not limited to, globally unique source addresses, globally unique destination addresses, and error control.
- Ethernet frames can carry any kind of packet. Networking at layer 2 is independent of the layer-3 protocol.
- Adding more layers to the Ethernet frame only slows the networking process down. This is known as nodal processing delay.
- You can add adjunct networking features, for example class of service (CoS) or multicasting, to Ethernet as readily as to IP networks.
- VLANs are an easy mechanism for isolating networks.
- Although it is not a substitute for IP networking, networking at layer 2 can be a powerful adjunct to IP networking. Layer-2 Ethernet usage has additional benefits over layer-3 IP network usage: speed, reduced overhead of the IP hierarchy, and no need to keep track of address configuration as systems move around.
- VLAN
- GRE
- VXLAN
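A quick way to compare these encapsulations is by their per-packet header overhead. The figures below are the common baseline values (a sketch: GRE is shown with the basic 4-byte header plus an outer IPv4 header, and optional GRE fields would add more):

```python
# Per-packet overhead (bytes) each encapsulation adds around/inside the frame:
#   VLAN  : 4-byte 802.1Q tag inserted into the existing Ethernet frame
#   GRE   : outer IPv4 header (20) + basic GRE header (4)
#   VXLAN : outer Ethernet (14) + outer IPv4 (20) + UDP (8) + VXLAN header (8)
OVERHEAD = {
    "vlan": 4,
    "gre": 20 + 4,
    "vxlan": 14 + 20 + 8 + 8,
}

def max_inner_size(mtu: int, encap: str) -> int:
    """Largest inner frame/payload that fits a physical MTU of `mtu` bytes."""
    return mtu - OVERHEAD[encap]
```

This is why VXLAN deployments typically either raise the physical MTU (e.g. to 1550 or 9000) or lower the instance MTU, so the encapsulated packet still fits on the wire.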
- How do network packets flow?
  - Between two VMs within the same host?
  - Between two VMs on different hosts?
- The instance 1 tap interface (1) forwards the packet to the Linux bridge qbr. The packet carries the destination MAC address of instance 2 (I2), because the destination resides on the same network.
- Security group rules (2) on the Linux bridge qbr handle firewalling and state tracking for the packet.
- The Linux bridge qbr forwards the packet to the Open vSwitch integration bridge br-int.
- The Open vSwitch integration bridge br-int adds the internal tag for the provider network.
- The Open vSwitch integration bridge br-int forwards the packet to the Open vSwitch provider bridge br-provider.
- The Open vSwitch provider bridge br-provider replaces the internal tag with the actual VLAN tag (segmentation ID) of the provider network.
- The Open vSwitch provider bridge br-provider forwards the packet to the physical network infrastructure via the provider network interface.
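The tag handling in the steps above can be sketched as follows. This is purely illustrative (the names `INTERNAL_TAG_MAP` and `SEGMENTATION_ID` are made up, not Neutron code): br-int tags traffic with a node-local VLAN id, and br-provider rewrites it to the real provider segmentation ID on egress.

```python
INTERNAL_TAG_MAP = {"provider-net": 1}    # br-int local VLAN, per network, per host
SEGMENTATION_ID = {"provider-net": 101}   # actual provider VLAN (segmentation ID)

def br_int_tag(packet: dict, network: str) -> dict:
    """br-int: add the internal (local) VLAN tag for the network."""
    return {**packet, "vlan": INTERNAL_TAG_MAP[network]}

def br_provider_retag(packet: dict, network: str) -> dict:
    """br-provider: swap the internal tag for the real segmentation ID."""
    return {**packet, "vlan": SEGMENTATION_ID[network]}

pkt = {"dst_mac": "fa:16:3e:00:00:02"}        # frame leaving the instance tap
pkt = br_int_tag(pkt, "provider-net")         # internal tag added on br-int
pkt = br_provider_retag(pkt, "provider-net")  # provider tag applied on br-provider
```

The key point is that the internal tag is only meaningful on a single host; only the segmentation ID is valid on the physical network.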
- Role of the L2 agent (ovs_neutron_agent):
  - Executes the actual networking commands according to the Neutron server's instructions.
  - Implemented as RPC plugins.
  - ovs-neutron-agent polls at the configured interval, looks for changes, and tries to apply them.
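The poll-and-apply behaviour can be sketched as a minimal loop. This is a toy under stated assumptions, not the real agent: the actual code also reacts to RPC notifications and uses ovsdb monitoring to minimize polling, rather than a bare sleep.

```python
import time

def agent_loop(get_desired_state, get_applied_state, apply_changes,
               polling_interval=2.0, max_iterations=None):
    """Minimal poll-and-apply loop in the spirit of the agent (illustrative)."""
    n = 0
    while max_iterations is None or n < max_iterations:
        desired = get_desired_state()
        if desired != get_applied_state():
            apply_changes(desired)        # converge toward the desired state
        n += 1
        if max_iterations is None or n < max_iterations:
            time.sleep(polling_interval)  # wait for the next poll wakeup
```

The essential property, which rpc_loop shares, is idempotence: each iteration compares desired vs. applied state and only acts on the difference, so a failed iteration can simply be retried on the next wakeup.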
- Configuration
  - https://docs.openstack.org/ocata/config-reference/networking/samples/openvswitch_agent.ini.html
  - https://docs.openstack.org/ocata/config-reference/networking/networking_options_reference.html
  - Most of the variables and members of ovs_neutron_agent used in rpc_loop() are controlled by config files.
rpc_loop()
- Invoked through main() -> daemon_loop() -> rpc_loop().
- Overall shape: register | signal handling | poll-wait and act on state.
Check the OVS state and act accordingly. Get the OVS status:
case OVS_RESTARTED:
- Set up the integration bridge.
- Set up the physical bridges.
- If enable_tunneling is true (config):
  - set up the tunnel bridges (link the tunnel bridge with the integration bridge using a patch port);
  - update l2population to avoid a race condition?
- If enable_distributed_routing is set:
  - set up the DVR flows.
- Notify of the OVS restart.
case OVS_DEAD:
- No action on a dead OVS.
- Continue polling and checking the status.
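The status dispatch above can be sketched as a small function. This is a hypothetical helper for discussion purposes (the real rpc_loop calls self.check_ovs_status() and performs the setup_* calls inline, and the constants live in the agent's common constants module):

```python
# Simplified stand-ins for the agent's OVS status constants.
OVS_NORMAL, OVS_RESTARTED, OVS_DEAD = range(3)

def handle_ovs_status(status, agent):
    """Return True for full resync, False to skip the iteration, None if OK."""
    if status == OVS_RESTARTED:
        agent.setup_integration_br()       # rebuild br-int flows
        agent.setup_physical_bridges()     # rebuild provider bridges
        if agent.enable_tunneling:
            agent.setup_tunnel_br()        # patch-linked to br-int
        if agent.enable_distributed_routing:
            agent.setup_dvr_flows()
        agent.notify_ovs_restarted()
        return True                        # everything must be resynced
    if status == OVS_DEAD:
        return False                       # do nothing; keep polling
    return None                            # OVS_NORMAL: proceed as usual

# A fake agent is enough to exercise the dispatch:
class FakeAgent:
    enable_tunneling = True
    enable_distributed_routing = False
    def __init__(self):
        self.calls = []
    def setup_integration_br(self):   self.calls.append("int")
    def setup_physical_bridges(self): self.calls.append("phys")
    def setup_tunnel_br(self):        self.calls.append("tun")
    def setup_dvr_flows(self):        self.calls.append("dvr")
    def notify_ovs_restarted(self):   self.calls.append("notify")
```

Note the asymmetry: a restart triggers a full rebuild and resync, while a dead OVS triggers nothing at all, since there is no vswitchd to program until it comes back.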
if bridge_monitor is set:
- Check whether a new bridge has been added; if yes, set up the physical bridge.
- Here the physical bridge is created and linked to the integration bridge.
- Here we get the bridge mapping, i.e. (physical network, OVS bridge) pairs.
- If the OVS bridge is not in the list of OVS bridges, terminate the agent. (Why not delete or correct the state?)
- Since the bridge already exists (that's why control is here), it will just update datapath_id etc.
- Ensure datapath_id is unique.
- Set up the fail mode, flow control, etc. for the bridge.
- Put the bridge in the physical_network-to-bridge mapping.
- The interface type of the ports linking the physical and integration bridges must be the same: check for patch or veth type and set up accordingly.
- Networking details.
- If it is recreated, then sync is set to true.
if tunnel_sync is true, configure tunnel endpoints with the other OVS agents.
The real work of the agent starts here (around line 2158), i.e. when the agent has some updates or a sync is required:
- Save a copy of updated_ports and activated_bindings in case a rollback is required.
- Get the port information.
- Since it is a sync operation, reset sync = false (for the next loop).
- self.process_deleted_ports(port_info)
- self.process_deactivated_bindings(port_info)
- self.process_activated_bindings(port_info, activated_bindings_copy)
- Check for changes in VIF ofport rules or stale ofports, and apply the updates.
- Store the current changes in vifname_to_ofport_map for the next loop.
- In case of any change in state, update the information on the Neutron server too.
- The same goes for the ancillary_brs bridges (external bridges).
- Mark the polling loop complete.
In case of any error, i.e. an exception, restore the old value of updated_ports and wait for the next poll wakeup.
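The save-copy / rollback pattern in that error path can be sketched as follows. The function and attribute names here mirror the notes rather than the exact Neutron code, so treat this as an illustration of the idea, not the implementation:

```python
import copy

def process_port_info(agent, port_info, activated_bindings):
    """Process port changes; on failure, restore the saved state so the
    next polling iteration retries the same work (illustrative sketch)."""
    updated_ports_copy = copy.copy(agent.updated_ports)
    activated_bindings_copy = copy.copy(activated_bindings)
    try:
        agent.process_deleted_ports(port_info)
        agent.process_deactivated_bindings(port_info)
        agent.process_activated_bindings(port_info, activated_bindings_copy)
    except Exception:
        # Roll back, then let rpc_loop wait for the next poll wakeup.
        agent.updated_ports = updated_ports_copy
        raise
```

Because the loop is idempotent, restoring the saved copy is enough for recovery: the pending work simply shows up again as a difference on the next iteration.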