Log ingestion introduces latency: the time from when data is created on a system to when it becomes available for analysis. Keeping this latency low is important for gathering data for analysis in a timely way.
Several factors contribute to this latency. First, there is agent time: the time from event discovery to delivery of the data, as a record, to the database or data collection endpoint. Next is pipeline time: the time the ingestion pipeline takes to process the log data. Lastly, indexing time is the time spent ingesting the record into storage. Note that a resource can also stop providing data entirely due to some issue; numerous tools help detect this, such as VMware's heartbeat alarm, which shows the status of communications and can detect when a system has locked up, crashed, or otherwise ceased to function.
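The three components above can be sketched as a simple breakdown of one record's end-to-end latency. This is an illustrative example with made-up timestamps and field names, not the schema of any particular tool:

```python
from datetime import datetime

# Hypothetical timestamps for a single log record as it moves through
# ingestion; the field names here are illustrative assumptions.
record = {
    "event_created":  datetime(2023, 5, 1, 12, 0, 0),
    "agent_received": datetime(2023, 5, 1, 12, 0, 4),   # agent time ends here
    "pipeline_done":  datetime(2023, 5, 1, 12, 0, 9),   # pipeline time ends here
    "indexed":        datetime(2023, 5, 1, 12, 0, 11),  # indexing time ends here
}

def latency_breakdown(r: dict) -> dict:
    """Split end-to-end ingestion latency into agent, pipeline, and indexing time."""
    return {
        "agent_time":    r["agent_received"] - r["event_created"],
        "pipeline_time": r["pipeline_done"] - r["agent_received"],
        "indexing_time": r["indexed"] - r["pipeline_done"],
        "total":         r["indexed"] - r["event_created"],
    }

breakdown = latency_breakdown(record)
for stage, delta in breakdown.items():
    print(f"{stage}: {delta.total_seconds():.0f}s")
```

The total is simply the sum of the three stages, which is why reducing any one of them reduces overall ingestion latency.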
Agent time can vary depending on your collection strategy. Different strategies for collecting data from systems affect latency, as does the type of logs being collected, such as resource logs or activity logs. Some solutions don't collect data through an agent at all, which can either increase or decrease latency.
Multiple factors add to pipeline time, such as custom logs: depending on how the pipeline processes them, they add time on top of logs collected by agents. Some solutions also apply heavier processing or stage data in temporary storage, which further increases latency.
You can measure the latency of a specific record by comparing its ingestion time to its time-generated property in tools like Microsoft Azure. You can also aggregate the results to see how ingestion latency behaves across many records. Azure also gives you the flexibility to analyze the data in different ways, such as using percentiles to get insights from larger amounts of data.