Skip to content

Fix netdev tests on CI#5324

Open
rata wants to merge 6 commits into
opencontainers:mainfrom
rata:tests/netdev-fixes
Open

Fix netdev tests on CI#5324
rata wants to merge 6 commits into
opencontainers:mainfrom
rata:tests/netdev-fixes

Conversation

@rata

@rata rata commented Jun 16, 2026

Copy link
Copy Markdown
Member

I propose to just remove the MAC address assertions from tests. As I mention in the commit msg (below), we do literally nothing about the mac on the runc code. If it changes, it is because of the host configuration, it is not relevant for our tests.

We can test with setting the mac at device creation time or do the udevadm settle (@lifubang tested the former, I tested the latter, it seems both fix the issue). But if that stops working, it is a bug in the distros (or maybe they willing change it and they don't conside it a bug), I think it's just simpler and better if our CI doesn't break because of that.

If in the future we change the spec to change the MAC (I doubt this will ever happen, this should be done on the host), let's figure it out then how to test this reliably on distros.

@kolyshkin @lifubang what do you think?


The runc code never changes the mac address and in some distros these
tests become unstable due to changes in the distros default network
configuration. We can play games to set it, or even it seems just
"udevadm settle" or using the mac when creating the device at the right
time might do it.

But those tricks might break in the future with other changes in the
distros. We literally do nothing about the mac. If in the future the
spec changes and we need to do something about it, let's figure it out
then how to test this reliably.

The runc code never changes the mac address and in some distros these
tests become unstable due to changes in the distros default network
configuration. We can play games to set it, or even it seems just
"udevadm settle" or using the mac when creating the device at the right
time might do it.

But those tricks might break in the future with other changes in the
distros. We literally do nothing about the mac. If in the future the
spec changes and we need to do something about it, let's figure it out
then how to test this reliably.

Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
@rata rata force-pushed the tests/netdev-fixes branch from 43ed160 to c6ccc1c Compare June 16, 2026 17:02
@rata rata changed the title WIP: Debug mac issues on Fedora Fix netdev tests on CI Jun 16, 2026
@rata rata marked this pull request as ready for review June 16, 2026 17:05
@rata rata requested review from kolyshkin and lifubang June 16, 2026 17:05
@rata

rata commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

cc @aojea

@aojea

aojea commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

LGTM, I never thought on that and the idea was to assert on interface attributes, so I do not think is relevant what type of interface we assert to meanwhile is stabe ... we already do on address and mtu

Thanks by the way

@kolyshkin

Copy link
Copy Markdown
Contributor

OTOH different distros may have different rules (not limited to changing the MAC address), and when we add a device, we need to wait so application of those rules are not racing with what we do.

#5310 (comment)

I am somewhat hesitant to open yet-another-PR to fix this but here's the patch: kolyshkin@63e0cbc

Once we add a new network device, systemd-udev may execute some rules.
In particular, we see that on Fedora it sets the MAC address (presumably
based on the host name and device name). This setting races with ours
'ip link set address', as a result, "checkpoint and restore with netdevice"
test sometimes fails telling the MAC address is not as expected.

In the future there may be some other udev rules etc., so the overall
solution is to wait until systemd-udev is finished applying the rules,
thus eliminating the race.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Comment thread tests/integration/netdev.bats
Comment thread tests/integration/checkpoint.bats Outdated
rata added 3 commits June 17, 2026 17:06
We are creating the interface for every test, but there is only one
using it. Let's just call the function to create the netdev on the test
that uses it.

I guess that was the reason we had the "ip link del ..." in teardown.
Because in a lot of tests we were just creating and deleting the
interface on the host.

While we are there, as suggested by lifubang, let's make the "ip link
add" line specify the mtu. This way, the interface is not created
without that info and we race with host daemons (like udev) that _might_
want to change it.

Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
The cleaning is condition on this variable being set. So let's unset it
after we clean the resources.

Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
We try to delete the interface, but it lot of tests it won't be there
unless we failed to move it to the container. Let's just clarify that in
a comment and redirect the error output to /dev/null, as it seems an
error otherwise while it is completely normal.

Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
@rata rata force-pushed the tests/netdev-fixes branch from 935f981 to 67333cd Compare June 17, 2026 15:06
@rata

rata commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

@kolyshkin @lifubang okay, so I took a little of all approaches here now. This PR:

  • Removes the check of the MAC address. As I said, we are not doing anything to it, so changes on the MAC only mean different host configuration. I'm fine if people want to keep it, I can remove this commit. It caused enough issues on CI for us that I lean towards just removing it.
  • @kolyshkin commit to do the settle. This is what this PR was doing in the first iteration (when it was a draft). It makes sense in case other attributes are changed by the host in the future and it is the right way to do it anyways.
  • Moved to specify the mtu on the "ip link add" in checkpoint.bats, as lifubang suggested.
  • Other small fixes that I found along the way

@kolyshkin kolyshkin left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the overall approach (of combining various things). Perhaps we can also unify create_netns and delete_netns and move these two to helpers.bash?

Comment thread tests/integration/checkpoint.bats
@aojea

aojea commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

commented in the original issue, we just need to add udevadm settle to avoid the race with systemd-udevd , that is the one that fights to set the mac address with the test script

@rata

rata commented Jun 18, 2026

Copy link
Copy Markdown
Member Author

@aojea yes, this is what this PR was doing initially (only that). But I'm not convinced it makes sense to test the MAC doesn't change here (if you press me I'd say the same for MTU :-D). We are literally not doing anything, if the mac changes are host configuration changes... Why would we care in CI about that? I didn't find any compelling argument, tbh.

@rata

rata commented Jun 18, 2026

Copy link
Copy Markdown
Member Author

@kolyshkin

I like the overall approach (of combining various things). Perhaps we can also unify create_netns and delete_netns and move these two to helpers.bash?

Maybe it's because is 7pm here and I'm tired, but I'm not sure about it. I think a little of duplication in test setup code is fine, we don't know how they may evolve over time. It's not that if we do a change in one, we for sure want to make the change in the other. But if you feel strongly, I can change it tomorrow (or feel free to push a fix to this branch :))

Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
@rata rata force-pushed the tests/netdev-fixes branch from 0503d46 to 37fabc7 Compare June 18, 2026 17:05
@aojea

aojea commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

(only that). But I'm not convinced it makes sense to test the MAC doesn't change here (if you press me I'd say the same for MTU :-D).

those are static properties of a netdev , I honbestly don't know if there is an implicit or explicit contract in the kernel but I assume they can not change since is only a network namespace change


# Create a dummy interface to move to the container.
ip link add dummy0 mtu $mtu_value type dummy
udevadm settle

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

udevadm settle has to be after the interface is created, that is when the udev event is generated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment just as note for self

@cyphar cyphar added this to the 1.5.0 milestone Jun 18, 2026
@rata

rata commented Jun 19, 2026

Copy link
Copy Markdown
Member Author

So, is this ready to merge or do you want me to delete the first commit that removes the mac addr changes?

@rata rata requested a review from kolyshkin June 19, 2026 09:06
@aojea

aojea commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

LGTM

@rata rata removed this from the 1.5.0 milestone Jun 19, 2026
@rata

rata commented Jun 19, 2026

Copy link
Copy Markdown
Member Author

Taking this out of the 1.5 milestone because CI on the 1.5 branch doesn't seem broken by this (it seems rawhide is not on the 1.5 branch).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants