Fix netdev tests on CI#5324
Conversation
The runc code never changes the mac address and in some distros these tests become unstable due to changes in the distros default network configuration. We can play games to set it, or even it seems just "udevadm settle" or using the mac when creating the device at the right time might do it. But those tricks might break in the future with other changes in the distros. We literally do nothing about the mac. If in the future the spec changes and we need to do something about it, let's figure it out then how to test this reliably. Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
|
cc @aojea |
|
LGTM, I never thought on that and the idea was to assert on interface attributes, so I do not think is relevant what type of interface we assert to meanwhile is stabe ... we already do on address and mtu Thanks by the way |
|
OTOH different distros may have different rules (not limited to changing the MAC address), and when we add a device, we need to wait so application of those rules are not racing with what we do. I am somewhat hesitant to open yet-another-PR to fix this but here's the patch: kolyshkin@63e0cbc |
Once we add a new network device, systemd-udev may execute some rules. In particular, we see that on Fedora it sets the MAC address (presumably based on the host name and device name). This setting races with ours 'ip link set address', as a result, "checkpoint and restore with netdevice" test sometimes fails telling the MAC address is not as expected. In the future there may be some other udev rules etc., so the overall solution is to wait until systemd-udev is finished applying the rules, thus eliminating the race. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We are creating the interface for every test, but there is only one using it. Let's just call the function to create the netdev on the test that uses it. I guess that was the reason we had the "ip link del ..." in teardown. Because in a lot of tests we were just creating and deleting the interface on the host. While we are there, as suggested by lifubang, let's make the "ip link add" line specify the mtu. This way, the interface is not created without that info and we race with host daemons (like udev) that _might_ want to change it. Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
The cleaning is condition on this variable being set. So let's unset it after we clean the resources. Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
We try to delete the interface, but it lot of tests it won't be there unless we failed to move it to the container. Let's just clarify that in a comment and redirect the error output to /dev/null, as it seems an error otherwise while it is completely normal. Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
|
@kolyshkin @lifubang okay, so I took a little of all approaches here now. This PR:
|
kolyshkin
left a comment
There was a problem hiding this comment.
I like the overall approach (of combining various things). Perhaps we can also unify create_netns and delete_netns and move these two to helpers.bash?
|
commented in the original issue, we just need to add |
|
@aojea yes, this is what this PR was doing initially (only that). But I'm not convinced it makes sense to test the MAC doesn't change here (if you press me I'd say the same for MTU :-D). We are literally not doing anything, if the mac changes are host configuration changes... Why would we care in CI about that? I didn't find any compelling argument, tbh. |
Maybe it's because is 7pm here and I'm tired, but I'm not sure about it. I think a little of duplication in test setup code is fine, we don't know how they may evolve over time. It's not that if we do a change in one, we for sure want to make the change in the other. But if you feel strongly, I can change it tomorrow (or feel free to push a fix to this branch :)) |
Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
those are static properties of a netdev , I honbestly don't know if there is an implicit or explicit contract in the kernel but I assume they can not change since is only a network namespace change |
|
|
||
| # Create a dummy interface to move to the container. | ||
| ip link add dummy0 mtu $mtu_value type dummy | ||
| udevadm settle |
There was a problem hiding this comment.
udevadm settle has to be after the interface is created, that is when the udev event is generated
There was a problem hiding this comment.
comment just as note for self
|
So, is this ready to merge or do you want me to delete the first commit that removes the mac addr changes? |
|
LGTM |
|
Taking this out of the 1.5 milestone because CI on the 1.5 branch doesn't seem broken by this (it seems rawhide is not on the 1.5 branch). |
I propose to just remove the MAC address assertions from tests. As I mention in the commit msg (below), we do literally nothing about the mac on the runc code. If it changes, it is because of the host configuration, it is not relevant for our tests.
We can test with setting the mac at device creation time or do the udevadm settle (@lifubang tested the former, I tested the latter, it seems both fix the issue). But if that stops working, it is a bug in the distros (or maybe they willing change it and they don't conside it a bug), I think it's just simpler and better if our CI doesn't break because of that.
If in the future we change the spec to change the MAC (I doubt this will ever happen, this should be done on the host), let's figure it out then how to test this reliably on distros.
@kolyshkin @lifubang what do you think?
The runc code never changes the mac address and in some distros these
tests become unstable due to changes in the distros default network
configuration. We can play games to set it, or even it seems just
"udevadm settle" or using the mac when creating the device at the right
time might do it.
But those tricks might break in the future with other changes in the
distros. We literally do nothing about the mac. If in the future the
spec changes and we need to do something about it, let's figure it out
then how to test this reliably.