
Collect logs from files in container, especially Kubernetes and Docker #19444

Closed
h0cheung opened this issue Mar 10, 2023 · 14 comments
Labels: enhancement (New feature or request), processor/k8sattributes (k8s Attributes processor), receiver/filelog, Stale

Comments

@h0cheung (Contributor) commented Mar 10, 2023

Component(s)

processor/k8sattributes, receiver/filelog

Is your feature request related to a problem? Please describe.

Writing logs to stdout/stderr and letting the container log driver collect them is regarded as “best practice”. For example, with Kubernetes we can collect stdout/stderr logs from /var/log/pods/, parse the Pod UID from the file path, and then get everything about the pod via k8sattributes (see the sketch below).
However, in some cases we still write logs to a file inside the container, so we need to collect logs from files in containers.
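
For reference, here is a minimal sketch of that standard flow, assuming the default /var/log/pods layout (operator and pod_association syntax may differ slightly between collector versions):

receivers:
  filelog:
    # /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/<restart>.log
    include: [/var/log/pods/*/*/*.log]
    include_file_path: true
    operators:
      # Extract the pod UID from the file path ...
      - type: regex_parser
        parse_from: attributes["log.file.path"]
        regex: '^/var/log/pods/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)/'
      # ... and promote it to a resource attribute so k8sattributes can associate the pod.
      - type: move
        from: attributes.uid
        to: resource["k8s.pod.uid"]

processors:
  k8sattributes:
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid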

With the overlay storage driver, we can read those files from /var/lib/docker/overlay2/OVERLAY_ID/upper, or from /run/containerd/io.containerd.runtime.v2.task/k8s.io/CONTAINER_ID/rootfs.
However:

  • I didn't find a way to get pod info for these paths in the current otelcol.
  • There are several other storage drivers besides overlay.

Describe the solution you'd like

ilogtail uses the Docker and CRI APIs to get the mount points of containers and saves them in a structure like this (simplified):

[]struct {
    Path   string            // host path of the container's mount point
    Labels map[string]string // container/pod labels associated with that path
}

The filelog receiver would then use these entries to find the files and attach the labels.
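
For illustration only, a hypothetical configuration for such a feature might look roughly like the sketch below. None of these option names exist in the filelog receiver today; they are invented here just to show the shape of the ilogtail-style approach (ask the container runtime for mounts, match paths as seen inside the container, attach the container/pod labels):

receivers:
  filelog:
    container_discovery:                       # hypothetical option, not implemented
      runtime_endpoints:                       # hypothetical: where to query Docker / CRI
        - unix:///var/run/docker.sock
        - unix:///run/containerd/containerd.sock
      include: [/app/logs/*.log]               # hypothetical: path as seen inside the container
      attach_labels: true                      # hypothetical: copy container/pod labels onto records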

Describe alternatives you've considered

There is also a simpler idea: build a mapping from container ID or overlay ID to Pod UID, then use k8sattributes. It works, but it is tricky and seems limiting.

Additional context

No response

h0cheung added the enhancement (New feature or request) and needs triage (New item requiring triage) labels on Mar 10, 2023
@github-actions (Contributor)

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski (Member)

@h0cheung, you might find this example helpful. Please let us know if you think this is still lacking functionality.

@h0cheung (Contributor, Author)

> @h0cheung, you might find this example helpful. Please let us know if you think this is still lacking functionality.

The example is about reading logs written to stdout/stderr.
The purpose of this issue is reading logs written to files inside containers.

@jsirianni (Member)

@h0cheung, it sounds like the sidecar collector approach might be appropriate for this situation. I think an emptyDir volume could be used to store the application's log file. This volume could be mounted by the collector sidecar, and the log file could be read directly by the filelog receiver.

Additionally, the downward API could be used to inject useful attributes such as the pod name, node name, and namespace as environment variables, which can then be added as resource attributes like k8s.pod.name and so on (see the sketch below).
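
A rough sketch of that setup; the names (my-app, app-logs), image tags, and paths below are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  volumes:
    - name: app-logs
      emptyDir: {}
  containers:
    - name: app                      # the application writes its log file into the shared volume
      image: my-app:latest
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: otel-collector           # sidecar reads the same volume with the filelog receiver
      image: otel/opentelemetry-collector-contrib:latest
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
      env:
        # Downward API: expose pod metadata to the collector as environment variables.
        - name: K8S_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: K8S_NAMESPACE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName

The collector config can then reference these variables, for example setting k8s.pod.name to ${env:K8S_POD_NAME} with the resource processor.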

@maokitty

@jsirianni It seems the downward API does not work for a daemonset collector?
In some cases people install the collector as a DaemonSet and collect logs from pods that do not write to stdout/stderr.

@h0cheung (Contributor, Author) commented Mar 10, 2023

> @h0cheung, it sounds like the sidecar collector approach might be appropriate for this situation. I think an emptyDir volume could be used to store the application's log file. This volume could be mounted by the collector sidecar, and the log file could be read directly by the filelog receiver.
>
> Additionally, the downward API could be used to inject useful attributes such as the pod name, node name, and namespace as environment variables, which can then be added as resource attributes like k8s.pod.name and so on.

Thanks. These solutions work. We can even parse the Pod UID from the path of the emptyDir volume (see the sketch below).
However, to use these methods we need to modify and restart deployments, which may be running online services. This also makes the configuration hard for those who are not very familiar with Kubernetes. And sidecars use more resources than a daemonset.

That's why I opened this issue. It would be better if we could do this without affecting running services, and make it easy for users to get started.
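
(For completeness on the “parse the Pod UID from the emptyDir path” remark: on the node, emptyDir volumes live under the kubelet root directory, so a DaemonSet collector could in principle match those paths. A sketch, assuming the default /var/lib/kubelet root is mounted into the collector via hostPath:)

receivers:
  filelog:
    # /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~empty-dir/<volume-name>/...
    include: [/var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/*/*.log]
    include_file_path: true
    operators:
      - type: regex_parser
        parse_from: attributes["log.file.path"]
        regex: '^/var/lib/kubelet/pods/(?P<uid>[a-f0-9\-]+)/volumes/'
      - type: move
        from: attributes.uid
        to: resource["k8s.pod.uid"]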

@jsirianni (Member)

> @jsirianni It seems the downward API does not work for a daemonset collector? In some cases people install the collector as a DaemonSet and collect logs from pods that do not write to stdout/stderr.

Correct, in my example I am suggesting the use of a sidecar collector, not a daemonset.

> > @h0cheung, it sounds like the sidecar collector approach might be appropriate for this situation. I think an emptyDir volume could be used to store the application's log file. This volume could be mounted by the collector sidecar, and the log file could be read directly by the filelog receiver.
> >
> > Additionally, the downward API could be used to inject useful attributes such as the pod name, node name, and namespace as environment variables, which can then be added as resource attributes like k8s.pod.name and so on.
>
> Thanks. These solutions work. We can even parse the Pod UID from the path of the emptyDir volume. However, to use these methods we need to modify and restart deployments, which may be running online services. This also makes the configuration hard for those who are not very familiar with Kubernetes. And sidecars use more resources than a daemonset.
>
> That's why I opened this issue. It would be better if we could do this without affecting running services, and make it easy for users to get started.

Understood, and I agree. Just wanted to make sure all options were being considered.

atoulme removed the needs triage (New item requiring triage) label on Mar 10, 2023
@h0cheung (Contributor, Author)

I think ilogtail's solution is good, and I'd like to implement it. Any ideas?

@github-actions (Contributor)

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on May 22, 2023
@h0cheung (Contributor, Author)

A new component proposal has been opened: #23339

dmitryax removed the Stale label on Jun 14, 2023
@github-actions (Contributor)

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Aug 14, 2023
dmitryax removed the Stale label on Aug 14, 2023
@github-actions (Contributor)

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions (Contributor)

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Dec 18, 2023
@djaglowski (Member)

This issue, #23339, and #25251 seem to request roughly the same functionality. I am closing this one and suggest we continue the conversation on #23339. We can reopen this issue if there is some reason that cannot be covered by the other.
