Description
Problem
I enabled autoscaling with min=1 and max=3. Scale-up works fine, but after the load is removed from the cluster, the scale-down operation never completes. Relevant cluster-autoscaler logs:
I0708 19:03:01.145119 1 static_autoscaler.go:547] Starting scale down
I0708 19:03:01.145149 1 scale_down.go:828] test-node-197eb5ee0c6 was unneeded for 9m57.972258789s
I0708 19:03:01.145159 1 scale_down.go:828] test-node-197eb5e9290 was unneeded for 9m57.972258789s
I0708 19:03:01.145171 1 scale_down.go:917] No candidates for scale down
I0708 19:03:06.599830 1 reflector.go:536] k8s.io/client-go/informers/factory.go:134: Watch close - v1.Pod total 10 items received
I0708 19:03:11.156997 1 static_autoscaler.go:235] Starting main loop
I0708 19:03:11.157208 1 client.go:169] NewAPIRequest API request URL:http://endpoint0.k8s.cloud.net/client/api?apiKey=***&command=listKubernetesClusters&id=f8232f4f-6c7b-4574-b6df-8e633ee6492b&response=json&signature=**
I0708 19:03:11.229389 1 client.go:175] NewAPIRequest response status code:200
I0708 19:03:11.229935 1 cloudstack_manager.go:88] Got cluster : &{f8232f4f-6c7b-4574-b6df-8e633ee6492b test 1 3 3 1 [0xc000fbf170 0xc000fbf1a0 0xc000fbf1d0 0xc000fbf200] map[test-control-197eb5404af:0xc000fbf170 test-node-197eb58e504:0xc000fbf1a0 test-node-197eb5e9290:0xc000fbf1d0 test-node-197eb5ee0c6:0xc000fbf200]}
W0708 19:03:11.230527 1 clusterstate.go:590] Failed to get nodegroup for 9bccb477-34ee-414b-9298-b927afe80503: Unable to find node 9bccb477-34ee-414b-9298-b927afe80503 in cluster
W0708 19:03:11.230548 1 clusterstate.go:590] Failed to get nodegroup for e8cfee6c-b8fa-4213-9d38-d7815490fa3e: Unable to find node e8cfee6c-b8fa-4213-9d38-d7815490fa3e in cluster
W0708 19:03:11.230557 1 clusterstate.go:590] Failed to get nodegroup for 10dfe63f-21e2-451a-ba69-8340c3625fef: Unable to find node 10dfe63f-21e2-451a-ba69-8340c3625fef in cluster
W0708 19:03:11.230566 1 clusterstate.go:590] Failed to get nodegroup for 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
I0708 19:03:11.230593 1 static_autoscaler.go:341] 4 unregistered nodes present
I0708 19:03:11.230600 1 static_autoscaler.go:624] Removing unregistered node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776
W0708 19:03:11.230610 1 static_autoscaler.go:627] Failed to get node group for 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
W0708 19:03:11.230618 1 static_autoscaler.go:346] Failed to remove unregistered nodes: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
I0708 19:03:11.230634 1 filter_out_schedulable.go:65] Filtering out schedulables
I0708 19:03:11.230643 1 filter_out_schedulable.go:137] Filtered out 0 pods using hints
I0708 19:03:11.230650 1 filter_out_schedulable.go:175] 0 pods were kept as unschedulable based on caching
I0708 19:03:11.230656 1 filter_out_schedulable.go:176] 0 pods marked as unschedulable can be scheduled.
I0708 19:03:11.230664 1 filter_out_schedulable.go:87] No schedulable pods
I0708 19:03:11.230676 1 static_autoscaler.go:433] No unschedulable pods
I0708 19:03:11.230688 1 static_autoscaler.go:480] Calculating unneeded nodes
I0708 19:03:11.230704 1 scale_down.go:418] Skipping test-control-197eb5404af from delete consideration - the node is marked as no scale down
I0708 19:03:11.230722 1 scale_down.go:448] Node test-node-197eb58e504 - memory utilization 0.136988
I0708 19:03:11.230738 1 scale_down.go:448] Node test-node-197eb5ee0c6 - cpu utilization 0.062500
I0708 19:03:11.230749 1 scale_down.go:448] Node test-node-197eb5e9290 - cpu utilization 0.062500
I0708 19:03:11.230782 1 scale_down.go:563] Finding additional 1 candidates for scale down.
I0708 19:03:11.230873 1 cluster.go:139] test-node-197eb58e504 for removal
I0708 19:03:11.230993 1 cluster.go:150] node test-node-197eb58e504 cannot be removed: non-daemonset, non-mirrored, non-pdb-assigned kube-system pod present: cluster-autoscaler-6b58bb74bd-n8v96
I0708 19:03:11.231008 1 scale_down.go:612] 1 nodes found to be unremovable in simulation, will re-check them at 2025-07-08 19:08:11.156961716 +0000 UTC m=+1275.265299051
I0708 19:03:11.231038 1 static_autoscaler.go:523] test-node-197eb5ee0c6 is unneeded since 2025-07-08 18:53:03.090583744 +0000 UTC m=+367.198921083 duration 10m8.066377968s
I0708 19:03:11.231053 1 static_autoscaler.go:523] test-node-197eb5e9290 is unneeded since 2025-07-08 18:53:03.090583744 +0000 UTC m=+367.198921083 duration 10m8.066377968s
I0708 19:03:11.231066 1 static_autoscaler.go:534] Scale down status: unneededOnly=false lastScaleUpTime=2025-07-08 18:48:52.589073816 +0000 UTC m=+116.697411157 lastScaleDownDeleteTime=2025-07-08 17:46:59.590592708 +0000 UTC m=-3596.301069916 lastScaleDownFailTime=2025-07-08 17:46:59.590592708 +0000 UTC m=-3596.301069916 scaleDownForbidden=false isDeleteInProgress=false scaleDownInCooldown=false
I0708 19:03:11.231085 1 static_autoscaler.go:547] Starting scale down
I0708 19:03:11.231115 1 scale_down.go:828] test-node-197eb5ee0c6 was unneeded for 10m8.066377968s
I0708 19:03:11.231127 1 scale_down.go:828] test-node-197eb5e9290 was unneeded for 10m8.066377968s
I0708 19:03:11.231146 1 scale_down.go:1102] Scale-down: removing empty node test-node-197eb5ee0c6
I0708 19:03:11.231929 1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"ecf5a84a-e03c-448a-a522-d1ccad99a876", APIVersion:"v1", ResourceVersion:"5387", FieldPath:""}): type: 'Normal' reason: 'ScaleDownEmpty' Scale-down: removing empty node test-node-197eb5ee0c6
I0708 19:03:11.245980 1 delete.go:103] Successfully added ToBeDeletedTaint on node test-node-197eb5ee0c6
I0708 19:03:11.246032 1 scale_down.go:1102] Scale-down: removing empty node test-node-197eb5e9290
I0708 19:03:11.289324 1 client.go:169] NewAPIRequest API request URL:http://endpoint0.k8s.cloud.net/client/api?apiKey=***&command=scaleKubernetesCluster&id=f8232f4f-6c7b-4574-b6df-8e633ee6492b&nodeids=10dfe63f-21e2-451a-ba69-8340c3625fef&response=json&signature=***
I0708 19:03:11.291007 1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"ecf5a84a-e03c-448a-a522-d1ccad99a876", APIVersion:"v1", ResourceVersion:"5387", FieldPath:""}): type: 'Normal' reason: 'ScaleDownEmpty' Scale-down: removing empty node test-node-197eb5e9290
I0708 19:03:11.291847 1 delete.go:103] Successfully added ToBeDeletedTaint on node test-node-197eb5e9290
I0708 19:03:11.434030 1 client.go:175] NewAPIRequest response status code:200
I0708 19:03:14.626060 1 reflector.go:255] Listing and watching v1beta1.CSIStorageCapacity from k8s.io/client-go/informers/factory.go:134
W0708 19:03:14.628689 1 reflector.go:324] k8s.io/client-go/informers/factory.go:134: failed to list v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0708 19:03:14.628744 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch v1beta1.CSIStorageCapacity: failed to list v1beta1.CSIStorageCapacity: the server could not find the requested resource
I0708 19:03:14.697103 1 reflector.go:536] /home/djumani/lab/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:356: Watch close - v1.StatefulSet total 11 items received
I0708 19:03:16.002359 1 reflector.go:536] k8s.io/client-go/informers/factory.go:134: Watch close - v1.Namespace total 11 items received
I0708 19:03:21.304032 1 static_autoscaler.go:235] Starting main loop
I0708 19:03:21.434565 1 client.go:169] NewAPIRequest API request URL:http://endpoint0.k8s.cloud.net/client/api?apiKey=***&command=queryAsyncJobResult&jobid=26e9ce54-e20d-4647-aacf-110ab8d69e12&response=json&signature=
I0708 19:03:21.512175 1 client.go:175] NewAPIRequest response status code:200
E0708 19:03:21.512495 1 client.go:118] API failed for job 26e9ce54-e20d-4647-aacf-110ab8d69e12 : map[account:admin accountid:e186da76-1aac-11ef-b9f4-005056a93347 cmd:org.apache.cloudstack.api.command.user.kubernetes.cluster.ScaleKubernetesClusterCmd completed:2025-07-08T19:03:12+0000 created:2025-07-08T19:03:11+0000 domainid:ff4c1425-1a67-11ef-9664-d67dd6fc2fed domainpath:ROOT jobid:26e9ce54-e20d-4647-aacf-110ab8d69e12 jobprocstatus:0 jobresult:map[errorcode:530 errortext:Scaling failed for Kubernetes cluster : test, failed to remove Kubernetes node: test-node-197eb5ee0c6 running on VM : test-node-197eb5ee0c6] jobresultcode:530 jobresulttype:object jobstatus:2 userid:4988af17-e62b-4d06-aee7-cb6c065aee6f]
E0708 19:03:21.512601 1 scale_down.go:1144] Problem with empty node deletion: failed to delete test-node-197eb5ee0c6: Unable to delete [10dfe63f-21e2-451a-ba69-8340c3625fef] from cluster : API failed for job 26e9ce54-e20d-4647-aacf-110ab8d69e12 : map[account:admin accountid:e186da76-1aac-11ef-b9f4-005056a93347 cmd:org.apache.cloudstack.api.command.user.kubernetes.cluster.ScaleKubernetesClusterCmd completed:2025-07-08T19:03:12+0000 created:2025-07-08T19:03:11+0000 domainid:ff4c1425-1a67-11ef-9664-d67dd6fc2fed domainpath:ROOT jobid:26e9ce54-e20d-4647-aacf-110ab8d69e12 jobprocstatus:0 jobresult:map[errorcode:530 errortext:Scaling failed for Kubernetes cluster : test, failed to remove Kubernetes node: test-node-197eb5ee0c6 running on VM : test-node-197eb5ee0c6] jobresultcode:530 jobresulttype:object jobstatus:2 userid:4988af17-e62b-4d06-aee7-cb6c065aee6f]
I0708 19:03:21.513134 1 client.go:169] NewAPIRequest API request URL:http://endpoint0.k8s.cloud.net/client/api?apiKey=***&command=scaleKubernetesCluster&id=f8232f4f-6c7b-4574-b6df-8e633ee6492b&nodeids=e8cfee6c-b8fa-4213-9d38-d7815490fa3e&response=json&signature=
I0708 19:03:21.517426 1 delete.go:197] Releasing taint {Key:ToBeDeletedByClusterAutoscaler Value:1752001391 Effect:NoSchedule TimeAdded:} on node test-node-197eb5ee0c6
I0708 19:03:21.589842 1 delete.go:228] Successfully released ToBeDeletedTaint on node test-node-197eb5ee0c6
I0708 19:03:21.590014 1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"test-node-197eb5ee0c6", UID:"3e1c275f-2325-40ec-b93a-82288da5d3db", APIVersion:"v1", ResourceVersion:"5325", FieldPath:""}): type: 'Warning' reason: 'ScaleDownFailed' failed to delete empty node: failed to delete test-node-197eb5ee0c6: Unable to delete [10dfe63f-21e2-451a-ba69-8340c3625fef] from cluster : API failed for job 26e9ce54-e20d-4647-aacf-110ab8d69e12 : map[account:admin accountid:e186da76-1aac-11ef-b9f4-005056a93347 cmd:org.apache.cloudstack.api.command.user.kubernetes.cluster.ScaleKubernetesClusterCmd completed:2025-07-08T19:03:12+0000 created:2025-07-08T19:03:11+0000 domainid:ff4c1425-1a67-11ef-9664-d67dd6fc2fed domainpath:ROOT jobid:26e9ce54-e20d-4647-aacf-110ab8d69e12 jobprocstatus:0 jobresult:map[errorcode:530 errortext:Scaling failed for Kubernetes cluster : test, failed to remove Kubernetes node: test-node-197eb5ee0c6 running on VM : test-node-197eb5ee0c6] jobresultcode:530 jobresulttype:object jobstatus:2 userid:4988af17-e62b-4d06-aee7-cb6c065aee6f]
I0708 19:03:22.011583 1 client.go:175] NewAPIRequest response status code:200
I0708 19:03:32.012495 1 client.go:169] NewAPIRequest API request URL:http://endpoint0.k8s.cloud.net/client/api?apiKey=***&command=queryAsyncJobResult&jobid=8f8c699e-beab-4ccb-a71c-233416e9db59&response=json&signature=***
I0708 19:03:32.088276 1 client.go:175] NewAPIRequest response status code:200
E0708 19:03:32.088569 1 client.go:118] API failed for job 8f8c699e-beab-4ccb-a71c-233416e9db59 : map[account:admin accountid:e186da76-1aac-11ef-b9f4-005056a93347 cmd:org.apache.cloudstack.api.command.user.kubernetes.cluster.ScaleKubernetesClusterCmd completed:2025-07-08T19:03:22+0000 created:2025-07-08T19:03:21+0000 domainid:ff4c1425-1a67-11ef-9664-d67dd6fc2fed domainpath:ROOT jobid:8f8c699e-beab-4ccb-a71c-233416e9db59 jobprocstatus:0 jobresult:map[errorcode:530 errortext:Kubernetes cluster test is in Alert state and can not be scaled] jobresultcode:530 jobresulttype:object jobstatus:2 userid:4988af17-e62b-4d06-aee7-cb6c065aee6f]
E0708 19:03:32.088642 1 scale_down.go:1144] Problem with empty node deletion: failed to delete test-node-197eb5e9290: Unable to delete [e8cfee6c-b8fa-4213-9d38-d7815490fa3e] from cluster : API failed for job 8f8c699e-beab-4ccb-a71c-233416e9db59 : map[account:admin accountid:e186da76-1aac-11ef-b9f4-005056a93347 cmd:org.apache.cloudstack.api.command.user.kubernetes.cluster.ScaleKubernetesClusterCmd completed:2025-07-08T19:03:22+0000 created:2025-07-08T19:03:21+0000 domainid:ff4c1425-1a67-11ef-9664-d67dd6fc2fed domainpath:ROOT jobid:8f8c699e-beab-4ccb-a71c-233416e9db59 jobprocstatus:0 jobresult:map[errorcode:530 errortext:Kubernetes cluster test is in Alert state and can not be scaled] jobresultcode:530 jobresulttype:object jobstatus:2 userid:4988af17-e62b-4d06-aee7-cb6c065aee6f]
I0708 19:03:32.088925 1 client.go:169] NewAPIRequest API request URL:http://endpoint0.k8s.cloud.net/client/api?apiKey=***&command=listKubernetesClusters&id=f8232f4f-6c7b-4574-b6df-8e633ee6492b&response=json&signature=***
I0708 19:03:32.094055 1 delete.go:197] Releasing taint {Key:ToBeDeletedByClusterAutoscaler Value:1752001391 Effect:NoSchedule TimeAdded:} on node test-node-197eb5e9290
I0708 19:03:32.105627 1 delete.go:228] Successfully released ToBeDeletedTaint on node test-node-197eb5e9290
I0708 19:03:32.105690 1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"test-node-197eb5e9290", UID:"70b091b5-8b83-4f4f-8955-8adb05164a96", APIVersion:"v1", ResourceVersion:"5356", FieldPath:""}): type: 'Warning' reason: 'ScaleDownFailed' failed to delete empty node: failed to delete test-node-197eb5e9290: Unable to delete [e8cfee6c-b8fa-4213-9d38-d7815490fa3e] from cluster : API failed for job 8f8c699e-beab-4ccb-a71c-233416e9db59 : map[account:admin accountid:e186da76-1aac-11ef-b9f4-005056a93347 cmd:org.apache.cloudstack.api.command.user.kubernetes.cluster.ScaleKubernetesClusterCmd completed:2025-07-08T19:03:22+0000 created:2025-07-08T19:03:21+0000 domainid:ff4c1425-1a67-11ef-9664-d67dd6fc2fed domainpath:ROOT jobid:8f8c699e-beab-4ccb-a71c-233416e9db59 jobprocstatus:0 jobresult:map[errorcode:530 errortext:Kubernetes cluster test is in Alert state and can not be scaled] jobresultcode:530 jobresulttype:object jobstatus:2 userid:4988af17-e62b-4d06-aee7-cb6c065aee6f]
I0708 19:03:32.162977 1 client.go:175] NewAPIRequest response status code:200
I0708 19:03:32.163512 1 cloudstack_manager.go:88] Got cluster : &{f8232f4f-6c7b-4574-b6df-8e633ee6492b test 1 3 3 1 [0xc000e183f0 0xc000e18420 0xc000e18450 0xc000e18480] map[test-control-197eb5404af:0xc000e183f0 test-node-197eb58e504:0xc000e18420 test-node-197eb5e9290:0xc000e18450 test-node-197eb5ee0c6:0xc000e18480]}
I0708 19:03:32.163548 1 metrics.go:380] Function cloudProviderRefresh took 10.859413386s to complete
W0708 19:03:32.163980 1 clusterstate.go:590] Failed to get nodegroup for 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
W0708 19:03:32.164017 1 clusterstate.go:590] Failed to get nodegroup for 9bccb477-34ee-414b-9298-b927afe80503: Unable to find node 9bccb477-34ee-414b-9298-b927afe80503 in cluster
W0708 19:03:32.164026 1 clusterstate.go:590] Failed to get nodegroup for e8cfee6c-b8fa-4213-9d38-d7815490fa3e: Unable to find node e8cfee6c-b8fa-4213-9d38-d7815490fa3e in cluster
W0708 19:03:32.164034 1 clusterstate.go:590] Failed to get nodegroup for 10dfe63f-21e2-451a-ba69-8340c3625fef: Unable to find node 10dfe63f-21e2-451a-ba69-8340c3625fef in cluster
I0708 19:03:32.164054 1 metrics.go:380] Function updateClusterState took 10.859971316s to complete
I0708 19:03:32.164071 1 static_autoscaler.go:341] 4 unregistered nodes present
I0708 19:03:32.164091 1 static_autoscaler.go:624] Removing unregistered node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776
W0708 19:03:32.164106 1 static_autoscaler.go:627] Failed to get node group for 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
W0708 19:03:32.164116 1 static_autoscaler.go:346] Failed to remove unregistered nodes: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
I0708 19:03:32.164403 1 filter_out_schedulable.go:65] Filtering out schedulables
I0708 19:03:32.164433 1 filter_out_schedulable.go:137] Filtered out 0 pods using hints
I0708 19:03:32.164440 1 filter_out_schedulable.go:175] 0 pods were kept as unschedulable based on caching
I0708 19:03:32.164445 1 filter_out_schedulable.go:176] 0 pods marked as unschedulable can be scheduled.
I0708 19:03:32.164454 1 filter_out_schedulable.go:87] No schedulable pods
I0708 19:03:32.164466 1 static_autoscaler.go:433] No unschedulable pods
I0708 19:03:32.164480 1 static_autoscaler.go:480] Calculating unneeded nodes
I0708 19:03:32.164494 1 scale_down.go:418] Skipping test-control-197eb5404af from delete consideration - the node is marked as no scale down
I0708 19:03:32.164502 1 scale_down.go:412] Skipping test-node-197eb5ee0c6 from delete consideration - the node is currently being deleted
I0708 19:03:32.164508 1 scale_down.go:412] Skipping test-node-197eb5e9290 from delete consideration - the node is currently being deleted
I0708 19:03:32.164514 1 scale_down.go:509] Scale-down calculation: ignoring 1 nodes unremovable in the last 5m0s
I0708 19:03:32.164538 1 static_autoscaler.go:534] Scale down status: unneededOnly=false lastScaleUpTime=2025-07-08 18:48:52.589073816 +0000 UTC m=+116.697411157 lastScaleDownDeleteTime=2025-07-08 19:03:11.156961716 +0000 UTC m=+975.265299051 lastScaleDownFailTime=2025-07-08 17:46:59.590592708 +0000 UTC m=-3596.301069916 scaleDownForbidden=false isDeleteInProgress=false scaleDownInCooldown=false
I0708 19:03:32.164582 1 static_autoscaler.go:547] Starting scale down
I0708 19:03:32.164619 1 scale_down.go:917] No candidates for scale down
I0708 19:03:32.174788 1 metrics.go:380] Function main took 10.870818976s to complete
I0708 19:03:41.208211 1 reflector.go:255] Listing and watching *v1beta1.PodDisruptionBudget from /home/djumani/lab/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:309
W0708 19:03:41.210565 1 reflector.go:324] /home/djumani/lab/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:309: failed to list v1beta1.PodDisruptionBudget: the server could not find the requested resource
E0708 19:03:41.210610 1 reflector.go:138] /home/djumani/lab/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:309: Failed to watch v1beta1.PodDisruptionBudget: failed to list v1beta1.PodDisruptionBudget: the server could not find the requested resource
I0708 19:03:42.175530 1 static_autoscaler.go:235] Starting main loop
I0708 19:03:42.175942 1 client.go:169] NewAPIRequest API request URL:http://endpoint0.k8s.cloud.net/client/api?apiKey=***&command=listKubernetesClusters&id=f8232f4f-6c7b-4574-b6df-8e633ee6492b&response=json&signature=
I0708 19:03:42.261887 1 client.go:175] NewAPIRequest response status code:200
I0708 19:03:42.262465 1 cloudstack_manager.go:88] Got cluster : &{f8232f4f-6c7b-4574-b6df-8e633ee6492b test 1 3 3 1 [0xc001500d50 0xc001500d80 0xc001500db0 0xc001500de0] map[test-control-197eb5404af:0xc001500d50 test-node-197eb58e504:0xc001500d80 test-node-197eb5e9290:0xc001500db0 test-node-197eb5ee0c6:0xc001500de0]}
I0708 19:03:42.262693 1 taints.go:77] Removing autoscaler soft taint when creating template from node
W0708 19:03:42.262989 1 clusterstate.go:590] Failed to get nodegroup for 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
W0708 19:03:42.263006 1 clusterstate.go:590] Failed to get nodegroup for 9bccb477-34ee-414b-9298-b927afe80503: Unable to find node 9bccb477-34ee-414b-9298-b927afe80503 in cluster
W0708 19:03:42.263015 1 clusterstate.go:590] Failed to get nodegroup for e8cfee6c-b8fa-4213-9d38-d7815490fa3e: Unable to find node e8cfee6c-b8fa-4213-9d38-d7815490fa3e in cluster
W0708 19:03:42.263023 1 clusterstate.go:590] Failed to get nodegroup for 10dfe63f-21e2-451a-ba69-8340c3625fef: Unable to find node 10dfe63f-21e2-451a-ba69-8340c3625fef in cluster
I0708 19:03:42.263051 1 static_autoscaler.go:341] 4 unregistered nodes present
I0708 19:03:42.263064 1 static_autoscaler.go:624] Removing unregistered node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776
W0708 19:03:42.263073 1 static_autoscaler.go:627] Failed to get node group for 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
W0708 19:03:42.263080 1 static_autoscaler.go:346] Failed to remove unregistered nodes: Unable to find node 1a1429cc-1a6b-4971-a9d3-d1d5bd8ac776 in cluster
I0708 19:03:42.263099 1 filter_out_schedulable.go:65] Filtering out schedulables
I0708 19:03:42.263107 1 filter_out_schedulable.go:137] Filtered out 0 pods using hints
I0708 19:03:42.263113 1 filter_out_schedulable.go:175] 0 pods were kept as unschedulable based on caching
I0708 19:03:42.263119 1 filter_out_schedulable.go:176] 0 pods marked as unschedulable can be scheduled.
I0708 19:03:42.263127 1 filter_out_schedulable.go:87] No schedulable pods
I0708 19:03:42.263139 1 static_autoscaler.go:433] No unschedulable pods
I0708 19:03:42.263154 1 static_autoscaler.go:480] Calculating unneeded nodes
I0708 19:03:42.263170 1 scale_down.go:418] Skipping test-control-197eb5404af from delete consideration - the node is marked as no scale down
I0708 19:03:42.263189 1 scale_down.go:448] Node test-node-197eb5ee0c6 - cpu utilization 0.062500
I0708 19:03:42.263206 1 scale_down.go:448] Node test-node-197eb5e9290 - cpu utilization 0.062500
I0708 19:03:42.263213 1 scale_down.go:509] Scale-down calculation: ignoring 1 nodes unremovable in the last 5m0s
I0708 19:03:42.263262 1 static_autoscaler.go:523] test-node-197eb5ee0c6 is unneeded since 2025-07-08 19:03:42.175479959 +0000 UTC m=+1006.283817346 duration 0s
I0708 19:03:42.263278 1 static_autoscaler.go:523] test-node-197eb5e9290 is unneeded since 2025-07-08 19:03:42.175479959 +0000 UTC m=+1006.283817346 duration 0s
I0708 19:03:42.263292 1 static_autoscaler.go:534] Scale down status: unneededOnly=false lastScaleUpTime=2025-07-08 18:48:52.589073816 +0000 UTC m=+116.697411157 lastScaleDownDeleteTime=2025-07-08 19:03:11.156961716 +0000 UTC m=+975.265299051 lastScaleDownFailTime=2025-07-08 17:46:59.590592708 +0000 UTC m=-3596.301069916 scaleDownForbidden=false isDeleteInProgress=false scaleDownInCooldown=false
I0708 19:03:42.263312 1 static_autoscaler.go:547] Starting scale down
I0708 19:03:42.263340 1 scale_down.go:828] test-node-197eb5ee0c6 was unneeded for 0s
I0708 19:03:42.263349 1 scale_down.go:828] test-node-197eb5e9290 was unneeded for 0s
I0708 19:03:42.263361 1 scale_down.go:917] No candidates for scale down
And here is my cluster-autoscaler pod YAML:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    prometheus.io/port: "8085"
    prometheus.io/scrape: "true"
  creationTimestamp: "2025-07-08T18:46:54Z"
  generateName: cluster-autoscaler-6b58bb74bd-
  generation: 1
  labels:
    app: cluster-autoscaler
    pod-template-hash: 6b58bb74bd
  name: cluster-autoscaler-6b58bb74bd-n8v96
  namespace: kube-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: cluster-autoscaler-6b58bb74bd
    uid: 52d00f4f-0c25-418a-b93d-def73da7db85
  resourceVersion: "936"
  uid: a3153638-7f93-4dd1-82a5-7f339f09e4ea
spec:
  containers:
  - command:
    - ./cluster-autoscaler
    - --v=4
    - --stderrthreshold=info
    - --cloud-provider=cloudstack
    - --cloud-config=/config/cloud-config
    - --skip-nodes-with-local-storage=false
    - --nodes=1:3:f8232f4f-6c7b-4574-b6df-8e633ee6492b
    image: apache/cloudstack-kubernetes-autoscaler:latest
    imagePullPolicy: IfNotPresent
    name: cluster-autoscaler
    resources:
      limits:
        cpu: 100m
        memory: 300Mi
      requests:
        cpu: 100m
        memory: 300Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /config
      name: cloud-config
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-dxffx
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: test-node-197eb58e504
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: cluster-autoscaler
  serviceAccountName: cluster-autoscaler
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: cloud-config
    secret:
      defaultMode: 420
      secretName: cloudstack-secret
  - name: kube-api-access-dxffx
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
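For reference, the cloudstack-secret mounted at /config carries the provider's cloud-config. To the best of my reading of the CloudStack provider documentation it is an INI file with api-url, api-key and secret-key, so the secret is roughly like this (keys redacted, endpoint taken from the log URLs above):

apiVersion: v1
kind: Secret
metadata:
  name: cloudstack-secret
  namespace: kube-system
stringData:
  # the autoscaler is started with --cloud-config=/config/cloud-config
  cloud-config: |
    [Global]
    api-url = http://endpoint0.k8s.cloud.net/client/api
    api-key = <redacted>
    secret-key = <redacted>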
Versions
CloudStack 4.19.1.3
Kubernetes v1.33.1
The steps to reproduce the bug
1. Install Kubernetes 1.33.1 without HA enabled.
2. Enable autoscaling with min=1, max=3.
3. Apply a Deployment with 100 replicas to trigger scale-up (see the sketch after this list).
4. Delete it to trigger scale-down.
...
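For step 3, a Deployment along these lines is enough to trigger the scale-up; this is only a minimal sketch (the name, image and request sizes are placeholders), any workload whose 100 replicas outgrow the single remaining node will do:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-test            # placeholder name
spec:
  replicas: 100
  selector:
    matchLabels:
      app: scale-test
  template:
    metadata:
      labels:
        app: scale-test
    spec:
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9    # any small image works
        resources:
          requests:                          # requests force pods onto additional nodes
            cpu: 100m
            memory: 64Mi

Applying it makes the autoscaler add nodes up to max=3; deleting it should, after the default 10-minute "unneeded" grace period, lead to the scale-down that fails in the logs above.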
What to do about it?
I tried a rollout restart of the cluster-autoscaler deployment, but it made no difference.