patchpanel: Generate neighbor failed/recovered events
Currently, we have a "connected" state for each neighbor monitored by
NeighborLinkMonitor, which is determined by whether the NUD state is in
NUD_VALID or not, and send D-Bus signals when that state changes. This
logic has the following issues:
- We don't really need a "connected" state changed event, what we need
is the event of link failed and recovered. One difference between them
is that when the link is going down, it will be disconnected but there
may not be a failure. We don't need to emit metrics or performing the
WiFi reattach in that case.
- The current logic for this state could be problematic. NUD_VALID means
the entry in the kernel neighbor table has a layer 2 address, but not
bidirectionally reachable. For example, when the device receives a ARP
request from an unknown neighbor, it will create an entry with state
of NUD_STALE, it's valid but actually the reachability is not
confirmed. We should only rely on NUD_REACHABLE and NUD_FAILED to
defer its reachability state.
So this patch does the following changes:
- Change the signal name from "NeighborConnectedStateChanged" to
"NeighborReachablilityEvent", which contains an event type of FAILED
or RECOVERED;
- Change the logic for generating events to rely on NUD_REACHABLE and
NUD_FAILED, instead of NUD_VALID.
BUG=b:162194516
TEST=Manually set the NUD state of gateway to NUD_FAILED, observed the
FAILED log on the shill side, and another RECOVERED log after some
time.
Change-Id: I580639917ed4bcf7ffe36e5128566cde4282ae8c
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/platform2/+/2532026
Reviewed-by: Hugo Benichi <hugobenichi@google.com>
Tested-by: Jie Jiang <jiejiang@chromium.org>
Commit-Queue: Jie Jiang <jiejiang@chromium.org>
diff --git a/patchpanel/manager.cc b/patchpanel/manager.cc
index 8562316..ac5dc06 100644
--- a/patchpanel/manager.cc
+++ b/patchpanel/manager.cc
@@ -238,7 +238,7 @@
datapath_.get(), forwarder);
network_monitor_svc_ = std::make_unique<NetworkMonitorService>(
shill_client_.get(),
- base::BindRepeating(&Manager::OnNeighborConnectedStateChanged,
+ base::BindRepeating(&Manager::OnNeighborReachabilityEvent,
weak_factory_.GetWeakPtr()));
network_monitor_svc_->Start();
@@ -789,20 +789,21 @@
return dbus_response;
}
-void Manager::OnNeighborConnectedStateChanged(
+void Manager::OnNeighborReachabilityEvent(
int ifindex,
const shill::IPAddress& ip_addr,
NeighborLinkMonitor::NeighborRole role,
- bool connected) {
+ NeighborReachabilityEventSignal::EventType event_type) {
if (!ip_addr.IsValid()) {
LOG(DFATAL) << "ip_addr is not valid";
return;
}
- using SignalProto = NeighborConnectedStateChangedSignal;
+ using SignalProto = NeighborReachabilityEventSignal;
SignalProto proto;
proto.set_ifindex(ifindex);
proto.set_ip_addr(ip_addr.ToString());
+ proto.set_type(event_type);
switch (role) {
case NeighborLinkMonitor::NeighborRole::kGateway:
proto.set_role(SignalProto::GATEWAY);
@@ -817,11 +818,10 @@
NOTREACHED();
}
- dbus::Signal signal(kPatchPanelInterface,
- kNeighborConnectedStateChangedSignal);
+ dbus::Signal signal(kPatchPanelInterface, kNeighborReachabilityEventSignal);
dbus::MessageWriter writer(&signal);
if (!writer.AppendProtoAsArrayOfBytes(proto)) {
- LOG(ERROR) << "Failed to encode proto NeighborConnectedStateChangedSignal";
+ LOG(ERROR) << "Failed to encode proto NeighborReachabilityEventSignal";
return;
}