GitHub - arvatoaws-labs/k8soptimizer: This tool optimizes the kubernetes workloads resource requsts and limits.

k8soptimizer

This tool optimizes the kubernetes workloads resource requsts and limits.

Say goodbye to resource underutilization and hello to a more efficient Kubernetes experience. Try k8soptimizer today and see the difference for yourself!

Description

Optimizing Kubernetes resource management has never been more efficient! k8soptimizer empowers you to fine-tune your deployments, ensuring optimal performance and resource allocation. Leverage Prometheus metrics to make data-driven decisions, dynamically adjusting resource requests and limits for unprecedented efficiency.

This tool can run once or on a regular schedule (eg. every 2 oder 4 hours) and will adjust the resource requests and limits of your deployments based on the historical resource utilization data from prometheus.

Features

Supports static and dynamic deployments (hpa)
- static deployments get more resources assigned because they cannot scale out
- dynamic deployments get less resources assigned because they can scale out
Support for cpu and memory resources
- The CPU requests are calculated based on a specified percentile of the sum of CPU allocations across all pods.
- Memory requests are determined by a specified percentile of memory usage, calculated from the average memory utilization across all pods.
- Memory limits are set based on a specified percentile of memory usage, derived from the maximum memory utilization across all pods.
Analyze historical resource utilization data using prometheus as a data source
- Queries based on quantile over time which can be adjusted
- Look back one week and predict the resource utilization for the hours
- Look back 4 hours and compared it to one week ago to get a trend
Automatically adjust deployment requests
- Increases memory requests and limits upon discovering OOM kills.
- Caps requests to 1 core for Node.js applications.
- Eliminates CPU limits following best practices (see https://home.robusta.dev/blog/stop-using-cpu-limits)
- Provides flexibility with various thresholds and configurable settings. (see configuration)
Highly tested code using the Pytest framework.
Can be executed as a Docker image.
Supports configuration through environment variables.
Helm Chart to run the k8skoptimizer as a cronjob in k8s

Differences between VPA

Supports running together wtih hpa (they won't fight each other)
Supports lower memory requests (there is no 256MB minimum as default)
Forecasts todays with data from one week ago in order to account for different usage pattern based on weekdays
Compares the last 4 hours with data from the last 4 hours a week ago in order to detect a trend
Updates the deployment object instead of the crated pod
Can also run as a cli command outside of the cluster

Known issues

Flux or other management might rollback the changes

Future ideas

Support for auto discovery of additional runtimes whith specific limitations (python does not consume more than 1 core)
Support for jvm discovery, maybe the memory request can be reduced (right now a java app would not lower memory consumption because it takes all it can get)
Support for statefulesets and daemonsets
Support for kubernetes events (to see oom kills and others useful events)
Dynamic configuration based on namespace or object annotations
Admission controller mode
Better logging and alerting

Quickstart

# cli mode

# install prometheus operator

# apply rules from contrib folder

kubectl apply -f contrib/prometheus-rules.yaml

# port forward prometheus

kubectl port-forward -n monitoring service/prometheus-operator-kube-p-prometheus 9090:9090

# run k8soptimizer

python3 src/k8soptimizer/main.py -n default -v --dry-run

# Modify the configuration to your needs export NAMESPACE_PATTERN="default"

# cluster mode

# install rbac permissions

kubectl apply -f deploy/rbac.yaml

# modify config.yaml to your needs

kubectl apply -f deploy/config.yaml

# deploy the cronjob

kubectl apply -f deploy/cronjob.yaml

# trigger the cronjob manually or wait for the next schedule # verify the logs of the cronjob

Configuration

The following environment variables can be used to configure the behavior of k8soptimizer.

PROMETHEUS_URL

Default: http://localhost:9090
Description: The URL of the Prometheus server used to query resource utilization metrics.

NAMESPACE_PATTERN

Default: .*
Description: A regular expression pattern to filter namespaces for optimization.

DEPLOYMENT_PATTERN

Default: .*
Description: A regular expression pattern to filter deployments for optimization.

CONTAINER_PATTERN

Default: .*
Description: A regular expression pattern to filter container names for optimization.

PROMETHEUS_URL

Default: http://localhost:9090
Description: The URL for the Prometheus server.

DEFAULT_LOOKBACK_MINUTES

Default: 240 (4 hours)
Description: The default lookback time in minutes for queries.

DEFAULT_OFFSET_MINUTES

Default: Computed based on a week minus DEFAULT_LOOKBACK_MINUTES.
Description: The default offset in minutes for queries.

DEFAULT_QUANTILE_OVER_TIME

Default: 0.95
Description: The default quantile value for queries. A higher value will result in more resources being allocated.

DEFAULT_QUANTILE_OVER_TIME_STATIC_CPU

Default: 0.95
Description: Default quantile value for CPU static configurations. A higher value will result in more resources being allocated.

DEFAULT_QUANTILE_OVER_TIME_HPA_CPU

Default: 0.7
Description: Default quantile value for CPU Horizontal Pod Autoscaler (HPA). A higher value will result in more resources being allocated.

DEFAULT_QUANTILE_OVER_TIME_STATIC_MEMORY

Default: 0.95
Description: Default quantile value for memory static configurations. A higher value will result in more resources being allocated.

DEFAULT_QUANTILE_OVER_TIME_HPA_MEMORY

Default: 0.8
Description: Default quantile value for memory Horizontal Pod Autoscaler (HPA). A higher value will result in more resources being allocated.

DRY_RUN_MODE

Default: False
Description: Flag for dry run mode.

MIN_CPU_REQUEST

Default: 0.010
Description: Minimum CPU request value.

MAX_CPU_REQUEST

Default: 16
Description: Maximum CPU request value.

MAX_CPU_REQUEST_NODEJS

Default: 1.0
Description: Maximum CPU request value specifically for Node.js.

CPU_REQUEST_RATIO

Default: 1.0
Description: CPU request ratio. Increase this value to allocate more CPU resources than historical usage.

MIN_MEMORY_REQUEST

Default: 16 MB (1024**2 * 16)
Description: Minimum memory request value in bytes.

MAX_MEMORY_REQUEST

Default: 16 GB (1024**3 * 16)
Description: Maximum memory request value in bytes.

MEMORY_REQUEST_RATIO

Default: 1.5
Description: Memory request ratio. Increase this value to allocate more memory resources than historical usage.

MEMORY_LIMIT_RATIO

Default: 2.0
Description: Memory limit ratio. Increase this value to allow more memory resources than historical usage.

MIN_MEMORY_LIMIT

Default: 16 MB (1024**2 * 16)
Description: Minimum memory limit value in bytes.

MAX_MEMORY_LIMIT

Default: 16 GB (1024**3 * 16)
Description: Maximum memory limit value in bytes.

CHANGE_THRESHOLD

Default: 0.1
Description: Threshold for change.

HPA_TARGET_REPLICAS_RATIO

Default: 0.1
Description: Ratio for Horizontal Pod Autoscaler (HPA) target replicas. This value is limited by the hpa min and max settings. A setting of 0 would result in having only min pods running, a setting of 1 would result in having max pods running.

TREND_LOOKBOOK_MINUTES

Default: 240 (4 hours)
Description: Trend lookback time in minutes.

TREND_OFFSET_MINUTES

Default: 10080 (7 days)
Description: Trend offset in minutes.

TREND_MAX_RATIO

Default: 1.5
Description: Maximum ratio for trends.

TREND_MIN_RATIO

Default: 0.5
Description: Minimum ratio for trends.

TREND_QUANTILE_OVER_TIME

Default: 0.8
Description: Quantile value for trends.

LOG_LEVEL

Default: INFO
Description: Logging level.

LOG_FORMAT

Default: json
Description: Logging format.

Name		Name	Last commit message	Last commit date
Latest commit History 90 Commits
.github		.github
contrib		contrib
deploy		deploy
docs		docs
helm/k8soptimizer		helm/k8soptimizer
src/k8soptimizer		src/k8soptimizer
tests		tests
.coveragerc		.coveragerc
.gitignore		.gitignore
.isort.cfg		.isort.cfg
.pre-commit-config.yaml		.pre-commit-config.yaml
.readthedocs.yml		.readthedocs.yml
AUTHORS.rst		AUTHORS.rst
CHANGELOG.rst		CHANGELOG.rst
CONTRIBUTING.rst		CONTRIBUTING.rst
Dockerfile		Dockerfile
LICENSE.txt		LICENSE.txt
README.rst		README.rst
pyproject.toml		pyproject.toml
renovate.json		renovate.json
requirements.txt		requirements.txt
setup.cfg		setup.cfg
setup.py		setup.py
tox.ini		tox.ini

License

arvatoaws-labs/k8soptimizer

Folders and files

Latest commit

History

Repository files navigation

k8soptimizer

Description

Features

Differences between VPA

Known issues

Future ideas

Quickstart

Configuration

PROMETHEUS_URL

NAMESPACE_PATTERN

DEPLOYMENT_PATTERN

CONTAINER_PATTERN

PROMETHEUS_URL

DEFAULT_LOOKBACK_MINUTES

DEFAULT_OFFSET_MINUTES

DEFAULT_QUANTILE_OVER_TIME

DEFAULT_QUANTILE_OVER_TIME_STATIC_CPU

DEFAULT_QUANTILE_OVER_TIME_HPA_CPU

DEFAULT_QUANTILE_OVER_TIME_STATIC_MEMORY

DEFAULT_QUANTILE_OVER_TIME_HPA_MEMORY

DRY_RUN_MODE

MIN_CPU_REQUEST

MAX_CPU_REQUEST

MAX_CPU_REQUEST_NODEJS

CPU_REQUEST_RATIO

MIN_MEMORY_REQUEST

MAX_MEMORY_REQUEST

MEMORY_REQUEST_RATIO

MEMORY_LIMIT_RATIO

MIN_MEMORY_LIMIT

MAX_MEMORY_LIMIT

CHANGE_THRESHOLD

HPA_TARGET_REPLICAS_RATIO

TREND_LOOKBOOK_MINUTES

TREND_OFFSET_MINUTES

TREND_MAX_RATIO

TREND_MIN_RATIO

TREND_QUANTILE_OVER_TIME

LOG_LEVEL

LOG_FORMAT

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages