Skip to content

Add image_id to CRI Container message #123508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 28, 2024

Conversation

saschagrunert
Copy link
Member

@saschagrunert saschagrunert commented Feb 26, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

This new field allows fixing the kubelet image garbage collection in container runtimes. The image_ref has been historically used by container runtimes to reference images by digest.

Which issue(s) this PR fixes:

Allows to fix cri-o/cri-o#7143

Special notes for your reviewer:

cc @mrunalp @haircommander @kubernetes/sig-node-api-reviews

Does this PR introduce a user-facing change?

Allowing container runtimes to fix an image garbage collection bug by adding an `image_id` field to the CRI Container message.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

None

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 26, 2024
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Feb 26, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Feb 26, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@saschagrunert
Copy link
Member Author

/retest

@k8s-triage-robot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@saschagrunert
Copy link
Member Author

/test pull-kubernetes-unit

@haircommander
Copy link
Contributor

thanks for picking this up! I believe for backwards compatibility with the kube pod API we need to change https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go#L1903 to be ImageID: ImageRef (if imageRef is != "", otherwise ImageID) or else we'll do the same thing as if CRI reported ImageID as ImageRef (new, broken cri-o behavior)

@saschagrunert
Copy link
Member Author

saschagrunert commented Feb 27, 2024

thanks for picking this up! I believe for backwards compatibility with the kube pod API we need to change https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go#L1903 to be ImageID: ImageRef (if imageRef is != "", otherwise ImageID) or else we'll do the same thing as if CRI reported ImageID as ImageRef (new, broken cri-o behavior)

@haircommander the ContainerStatus has not changed in this PR, means we have no field ImageRef at all. Therefore the behavior will also not change. We only changed the Container message, which is the one used for the image GC.

@sftim
Copy link
Contributor

sftim commented Feb 27, 2024

Does this PR make any change that a Kubernetes cluster operator could observe (even indirectly, but without using a debugger or trace logging)?

@saschagrunert
Copy link
Member Author

Does this PR make any change that a Kubernetes cluster operator could observe (even indirectly, but without using a debugger or trace logging)?

No, runtimes do not support that field yet so there will be no change. Only if runtimes set the value of the field, then it will influence how the image GC works.

@sftim
Copy link
Contributor

sftim commented Feb 27, 2024

runtimes do not support that field yet so there will be no change

If I used the release of Kubernetes that included this change, and a container runtime with support, would I notice the behavior? If so, we should change log it so people know it's available.

@mrunalp
Copy link
Contributor

mrunalp commented Feb 27, 2024

@mikebrow @SergeyKanzhelev ptal

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see suggested comment doc edits

@mikebrow
Copy link
Member

thanks for picking this up! I believe for backwards compatibility with the kube pod API we need to change https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go#L1903 to be ImageID: ImageRef (if imageRef is != "", otherwise ImageID) or else we'll do the same thing as if CRI reported ImageID as ImageRef (new, broken cri-o behavior)

@haircommander the ContainerStatus has not changed in this PR, means we have no field ImageRef at all. Therefore the behavior will also not change. We only changed the Container message, which is the one used for the image GC.

fair to say we should also fix the status response in a similar fashion.. perhaps in another pr?

@haircommander
Copy link
Contributor

can someone help find where we propegate kuberuntime.ContainerStatus to the kube API? From my perspective, we are translating kuberuntime.ImageID -> ContainerStatus.ImageID somewhere but I am struggling to find where. If that's the case, then it feels like this PR would change it so that we're reporting ImageID (which is coming from the CRI as an ImageID, not ImageRef) and then we'd change the ContainerStatus to have the hash and not repo + digest. It sounds like this isn't what is happening though

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Feb 28, 2024
@saschagrunert
Copy link
Member Author

runtimes do not support that field yet so there will be no change

If I used the release of Kubernetes that included this change, and a container runtime with support, would I notice the behavior? If so, we should change log it so people know it's available.

Yes, changed the release note for that.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mikebrow, mrunalp, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 28, 2024
@k8s-ci-robot k8s-ci-robot merged commit 68a4705 into kubernetes:master Feb 28, 2024
@k8s-ci-robot k8s-ci-robot added this to the v1.30 milestone Feb 28, 2024
@saschagrunert saschagrunert deleted the image-id-container branch February 29, 2024 08:10
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Feb 29, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@saschagrunert
Copy link
Member Author

There is a image ID test available: https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/image_id_test.go

CRI-O still returns the image ID if the image is already present on disk, that's why the test allows both:

gomega.Equal(busyBoxImage),
gomega.MatchRegexp(`[[:xdigit:]]{64}`),

saschagrunert added a commit to saschagrunert/kubernetes that referenced this pull request Feb 29, 2024
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:

https://github.com/kubernetes/kubernetes/blob/411c29c39f03687a30a8667295c61590def8fc89/pkg/kubelet/container/helpers.go#L259-L292

To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.

We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.

Follow-up on: kubernetes#123508

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@saschagrunert
Copy link
Member Author

But there is a corner case when developers use ConvertPodStatusToRunningPod and override the field. Closing this gap in #123583

saschagrunert added a commit to saschagrunert/kubernetes that referenced this pull request Feb 29, 2024
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:

https://github.com/kubernetes/kubernetes/blob/411c29c39f03687a30a8667295c61590def8fc89/pkg/kubelet/container/helpers.go#L259-L292

To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.

We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.

Follow-up on: kubernetes#123508

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Feb 29, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/kubernetes that referenced this pull request Feb 29, 2024
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:

https://github.com/kubernetes/kubernetes/blob/411c29c39f03687a30a8667295c61590def8fc89/pkg/kubelet/container/helpers.go#L259-L292

To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.

We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.

Follow-up on: kubernetes#123508

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
k8s-publishing-bot pushed a commit to kubernetes/cri-api that referenced this pull request Mar 4, 2024
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:

https://github.com/kubernetes/kubernetes/blob/411c29c39f03687a30a8667295c61590def8fc89/pkg/kubelet/container/helpers.go#L259-L292

To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.

We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.

Follow-up on: kubernetes/kubernetes#123508

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>

Kubernetes-commit: e38531e9a2359c2ba1505cb04d62d6810edc616e
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Mar 5, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Mar 5, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Mar 5, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Mar 5, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Mar 5, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Mar 5, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Mar 5, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Jeffwan pushed a commit to Jeffwan/kubernetes that referenced this pull request Mar 6, 2024
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:

https://github.com/kubernetes/kubernetes/blob/411c29c39f03687a30a8667295c61590def8fc89/pkg/kubelet/container/helpers.go#L259-L292

To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.

We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.

Follow-up on: kubernetes#123508

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
dinhxuanvu pushed a commit to dinhxuanvu/kubernetes that referenced this pull request Mar 28, 2024
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:

https://github.com/kubernetes/kubernetes/blob/411c29c39f03687a30a8667295c61590def8fc89/pkg/kubelet/container/helpers.go#L259-L292

To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.

We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.

Follow-up on: kubernetes#123508

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to saschagrunert/cri-o that referenced this pull request Apr 5, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
ah8ad3 pushed a commit to ah8ad3/kubernetes that referenced this pull request Apr 6, 2024
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:

https://github.com/kubernetes/kubernetes/blob/411c29c39f03687a30a8667295c61590def8fc89/pkg/kubelet/container/helpers.go#L259-L292

To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.

We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.

Follow-up on: kubernetes#123508

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
haircommander pushed a commit to haircommander/cri-o that referenced this pull request May 2, 2024
Using the implementation of kubernetes/kubernetes#123508
to fix cri-o#7143

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Development

Successfully merging this pull request may close these issues.

imageRef in container status is not image ID and lead to kubelet image gc work abnormally
8 participants