Adding Processing Timestamp and Hostname in Logstash Using Ruby

When processing event streams with Logstash, it can be useful to record the time an event was processed and the hostname of the Logstash server handling it. This information is important for debugging, monitoring, and ensuring the traceability of event data.
Logstash makes this task straightforward using the Ruby filter plugin, which allows embedding Ruby code to manipulate event data.

Scenario

Let’s consider a situation where you need to:

  1. Add a field that captures the exact time Logstash processed an event.
  2. Record the hostname of the Logstash server handling the event.

Solution

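You can use the Ruby filter in the filter section of your Logstash pipeline configuration as follows:

ruby {
    # Runs once at pipeline startup: load Ruby's Socket library.
    init => "require 'socket'"
    # Runs for every event: record the processing time and this server's hostname.
    code => "
        event.set('[receipt0][time]', LogStash::Timestamp.new(Time.now))
        event.set('[receipt0][hostname]', Socket.gethostname)
    "
}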

Explanation of the Code

Initialization (init):

  • The require 'socket' statement loads Ruby’s Socket library, which provides methods to retrieve the hostname of the server.
  • The init block runs once during the initialization of the Logstash pipeline.

Timestamp (Time.now):

  • Time.now fetches the current time from the server.
  • LogStash::Timestamp.new wraps this value in Logstash’s native timestamp type, which is serialized as an ISO 8601 string in UTC that Elasticsearch can map as a date.

Hostname (Socket.gethostname):

  • Socket.gethostname retrieves the hostname of the Logstash server processing the event.

Event Fields:

  • The code adds two fields to each event under the receipt0 object:
    • receipt0.time: The processing timestamp.
    • receipt0.hostname: The server hostname.
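
To see what the two calls return without starting Logstash, here is a minimal plain-Ruby sketch. It is an illustration only: the add_receipt helper and the nested Hash used as a stand-in for the Logstash event are assumptions made for this example, not part of the Logstash API.

```ruby
require 'socket'
require 'time'

# Hypothetical stand-in for the filter body: a plain Hash plays the role of
# the Logstash event, and the timestamp is formatted by hand as ISO 8601 UTC.
def add_receipt(event, now: Time.now)
  event['receipt0'] = {
    'time' => now.getutc.iso8601(3),   # UTC, millisecond precision
    'hostname' => Socket.gethostname   # name of the machine running this code
  }
  event
end

event = add_receipt({ 'message' => 'Sample log message' })
puts event
```

Inside a real pipeline, event.set and LogStash::Timestamp take care of the nesting and the timestamp formatting, so only the two values themselves carry over from this sketch.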

Pipeline Configuration Example

Here’s how the Ruby filter might be integrated into a complete pipeline:

input {
    beats {
        port => 5044
    }
}

filter {
    ruby {
        init => "require 'socket'"
        code => "
            event.set('[receipt0][time]', LogStash::Timestamp.new(Time.now))
            event.set('[receipt0][hostname]', Socket.gethostname)
        "
    }
}

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "logstash-receipts"
    }
    stdout { codec => json }
}

Result

Each processed event will now include the additional metadata (other fields, such as @timestamp, are omitted here for brevity):

{
  "message": "Sample log message",
  "receipt0": {
    "time": "2024-11-26T10:15:30.000Z",
    "hostname": "logstash-server-1"
  }
}

Benefits

  1. Traceability:
    • The time field ensures you know exactly when an event was processed.
    • The hostname field identifies the specific Logstash server in distributed environments.
  2. Debugging:
    • Quickly diagnose delays or bottlenecks by analyzing processing timestamps.
  3. Monitoring:
    • Use the added metadata for performance monitoring or server health checks.
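
As a rough illustration of the debugging use case, the sketch below computes the gap between an event's own timestamp and its processing timestamp. It assumes the event timestamp reflects when the event was created at the source, which depends on your input and pipeline; the processing_lag_seconds helper and the sample values are hypothetical.

```ruby
require 'time'

# Seconds elapsed between the event's own timestamp and the time Logstash
# processed it. Both arguments are ISO 8601 strings.
def processing_lag_seconds(event_time, receipt_time)
  Time.iso8601(receipt_time) - Time.iso8601(event_time)
end

lag = processing_lag_seconds('2024-11-26T10:15:28.500Z', '2024-11-26T10:15:30.000Z')
puts "processing lag: #{lag} s"
```

A lag that keeps growing on one hostname while staying flat on the others would point at that particular server rather than at the pipeline as a whole.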

Additional Notes

  • Field Names: Replace [receipt0][time] and [receipt0][hostname] with field names that align with your data schema, if necessary.
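  • Time Zone: Time.now returns a time in the server’s local time zone, but LogStash::Timestamp stores the instant in UTC, so the field is serialized in UTC (note the trailing Z in the result above) whether or not you call Time.now.utc.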
  • Performance: The Ruby filter is lightweight, but avoid overloading it with excessive computations in high-throughput environments.
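
The time zone point can be checked in plain Ruby. The sketch below uses a fixed example instant with an assumed +02:00 offset in place of Time.now, so the printed values are reproducible; it shows that a local time and its UTC conversion are the same instant and differ only in how they are displayed.

```ruby
require 'time'

# A fixed example instant expressed with a +02:00 offset (stand-in for Time.now).
local = Time.new(2024, 11, 26, 12, 15, 30, '+02:00')
utc = local.getutc   # same instant, expressed in UTC

puts local.iso8601   # 2024-11-26T12:15:30+02:00
puts utc.iso8601     # 2024-11-26T10:15:30Z
puts local == utc    # true
```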

By adding this Ruby filter to your Logstash pipeline, every event records when and where it was processed, which makes delays and host-specific problems easier to trace once the data is in Elasticsearch.

The post Adding Processing Timestamp and Hostname in Logstash Using Ruby appeared first on SOC Prime.