    Log timestamp analysis

    Quick answer: Parse every log timestamp into an explicit instant (UTC epoch seconds or milliseconds) before correlation. Extract with regex or structured parsers, normalize timezone offsets, and reject ambiguous local times during DST fall-back. Store both @timestamp and the original raw string for audit replay.

    Common log timestamp formats

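    Nginx access logs default to $time_local, which prints the server's local time in brackets with a numeric offset, for example [22/Apr/2026:14:05:01 +0000]. The offset keeps each line unambiguous, but the month-name layout does not sort lexically, and mixed offsets across a global fleet make manual correlation error-prone. Prefer $time_iso8601 in new configurations. Apache combined logs emit the same shape, e.g. [10/Oct/2000:13:55:36 -0700], again with an explicit offset. The dangerous case is a custom format that prints local wall time with no offset at all, because the repeated hour at DST fall-back cannot be told apart. Syslog (RFC 5424) carries a timestamp with optional fractional seconds and an explicit offset or Z. JSON logs from Node often emit either toISOString() output or a numeric epoch; always check which.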

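    | Source             | Example                          | Regex / notes                                          |
    |--------------------|----------------------------------|--------------------------------------------------------|
    | nginx time_iso8601 | 2026-04-22T14:05:01+00:00        | \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}     |
    | Apache combined    | [22/Apr/2026:14:05:01 +0000]     | \[([^\]]+)\]                                           |
    | JSON ISO           | "ts":"2026-04-22T14:05:01.123Z"  | "ts":"([^"]+)"                                         |
    | JSON epoch ms      | "ts":1776866701123               | "ts":(\d{13})                                          |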
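
    The patterns in the table can be wired into one extractor that returns UTC epoch milliseconds. The following is a minimal sketch, not a definitive parser: the function and constant names are hypothetical, the regexes are slightly stricter variants of the table's patterns (they accept negative offsets and only 3 or 6 fractional digits), and it assumes at most one timestamp per line.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Compiled once at import time so a streaming loop does not recompile per line.
EPOCH_MS_RE = re.compile(r'"ts":(\d{13})\b')
ISO_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d{3}(?:\d{3})?)?(?:Z|[+-]\d{2}:\d{2})"
)
APACHE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Integer arithmetic avoids float rounding on the millisecond digit."""
    return (dt - UNIX_EPOCH) // timedelta(milliseconds=1)


def extract_epoch_ms(line: str) -> Optional[int]:
    """Return the first recognized timestamp in line as UTC epoch ms, else None."""
    m = EPOCH_MS_RE.search(line)
    if m:
        return int(m.group(1))
    m = ISO_RE.search(line)
    if m:
        # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
        return to_epoch_ms(datetime.fromisoformat(m.group(0).replace("Z", "+00:00")))
    m = APACHE_RE.search(line)
    if m:
        # %b assumes English month abbreviations (C or English locale).
        return to_epoch_ms(datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z"))
    return None
```

    Returning None for unrecognized lines, rather than raising, lets the caller decide whether to count, quarantine, or drop them.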

    Normalizing to UTC and detecting timezones

    If the log line includes a numeric offset, parse it and convert to UTC before indexing. If only local wall time exists, you must inject the known IANA zone of the emitting host — never guess from the reader's laptop clock. For multi-line stack traces, propagate the header timestamp downward or reject lines without context rather than inventing synthetic times.
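
    For logs that carry only local wall time, the zone injection and the DST check can be sketched with the standard-library zoneinfo module. This assumes Python 3.9+, an available tz database (system files or the tzdata package), and a 'YYYY-MM-DD HH:MM:SS' input layout; local_to_utc is a hypothetical helper name, and the zone must come from host metadata, not from the machine running the parser.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_to_utc(wall: str, zone_name: str) -> datetime:
    """Convert a naive wall time from a known host zone to an aware UTC datetime.

    Raises ValueError when the wall time is ambiguous (DST fall-back) or
    nonexistent (DST spring-forward gap) in that zone.
    """
    zone = ZoneInfo(zone_name)
    naive = datetime.strptime(wall, "%Y-%m-%d %H:%M:%S")
    first = naive.replace(tzinfo=zone, fold=0)
    second = naive.replace(tzinfo=zone, fold=1)
    if first.utcoffset() != second.utcoffset():
        # Two candidate offsets: the hour was either repeated or skipped.
        # A repeated hour survives a round trip through UTC; a skipped one does not.
        round_trip = first.astimezone(timezone.utc).astimezone(zone)
        kind = "ambiguous" if round_trip.replace(tzinfo=None) == naive else "nonexistent"
        raise ValueError(f"{kind} local time {wall!r} in {zone_name}")
    return first.astimezone(timezone.utc)
```

    Raising instead of silently picking one offset matches the advice to reject ambiguous times; a pipeline that must keep such lines could store the raw string and flag the record for review.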

    Awk and grep pipelines

    # Apache-style bracketed time — split on brackets (GNU/BSD awk)
    awk -F'[][]' 'NF>1 { print $2 }' access.log | head
    
    # Lines starting with ISO-8601 date
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}T' structured.log
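
    The grep pattern keeps only lines that begin with a timestamp, which drops stack-trace continuation lines. A rough Python sketch of the alternative, carrying the header timestamp down to the lines that follow it, is shown here; attach_timestamps is a hypothetical name, and the header pattern assumes the timestamp is the first whitespace-delimited token.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

# Same shape as the grep pattern: a line that begins with an ISO-8601 date.
HEADER_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\S+)")


def attach_timestamps(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (timestamp, line) pairs; continuation lines inherit the last header.

    Lines seen before any header are skipped rather than given an invented time.
    """
    current: Optional[str] = None
    for line in lines:
        line = line.rstrip("\n")
        m = HEADER_RE.match(line)
        if m:
            current = m.group(1)
        if current is not None:
            yield current, line
```

    Because it is a generator over any iterable of lines, it can be fed an open file object and never holds the whole log in memory.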

    Python parsing sketch

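    from datetime import datetime, timezone
    
    def parse_apache_ts(chunk: str) -> int:
        """Return UTC epoch seconds for an Apache-style timestamp such as
        '22/Apr/2026:14:05:01 +0000' (the text between the brackets)."""
        # %b matches English month abbreviations only under a C or English locale.
        # %z reads the numeric offset, so the result is a timezone-aware datetime.
        dt = datetime.strptime(chunk, "%d/%b/%Y:%H:%M:%S %z")
        # timestamp() on an aware datetime is already offset-independent;
        # astimezone(timezone.utc) just makes the UTC intent explicit.
        return int(dt.astimezone(timezone.utc).timestamp())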
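
    To keep the raw string next to the normalized value, as the quick answer recommends, the same strptime format can feed a small record builder. This is only a sketch: the field names other than @timestamp (epoch_s, raw_ts, source_offset) are hypothetical and should follow whatever schema the index already uses.

```python
from datetime import datetime, timezone


def to_record(raw_ts: str) -> dict:
    """Parse an Apache-style timestamp and keep the raw text beside the UTC value."""
    dt = datetime.strptime(raw_ts, "%d/%b/%Y:%H:%M:%S %z")
    utc = dt.astimezone(timezone.utc)
    return {
        "@timestamp": utc.strftime("%Y-%m-%dT%H:%M:%SZ"),  # normalized, for search
        "epoch_s": int(utc.timestamp()),                   # numeric, for correlation
        "raw_ts": raw_ts,                                  # untouched, for audit replay
        "source_offset": dt.strftime("%z"),                # offset as emitted by the host
    }
```

    For example, to_record("22/Apr/2026:07:05:01 -0700") yields an @timestamp of 2026-04-22T14:05:01Z while preserving -0700 as the source offset.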

    Key takeaways

    • Bracketed local times without offsets are a tech-debt magnet — reconfigure emitters when possible.
    • Always keep raw substrings until parsers are fuzz-tested against daylight saving edges.
    • JSON logs mix string ISO and numeric epoch — enforce schema in CI.
    • Use UTC in search indices; keep source offset metadata for legal holds.
    • For high volume, compile regex once and stream-parse rather than loading whole files.
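
    One way to enforce the JSON timestamp schema in CI is a check that scans sample log lines and reports every ts value that is not an ISO-8601 string with Z or a numeric offset. The sketch below assumes that policy and the field name ts from the table; ts_violations and the reason strings are hypothetical.

```python
import json
import re

# Assumed policy: "ts" must be an ISO-8601 string ending in "Z" or a numeric offset.
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})")


def ts_violations(lines):
    """Return (line_number, reason) for each JSON log line that breaks the policy."""
    problems = []
    for n, line in enumerate(lines, start=1):
        try:
            ts = json.loads(line).get("ts")
        except (json.JSONDecodeError, AttributeError):
            # Either invalid JSON or valid JSON that is not an object.
            problems.append((n, "not a JSON object"))
            continue
        if isinstance(ts, (int, float)):
            problems.append((n, "numeric epoch, expected ISO string"))
        elif not isinstance(ts, str):
            problems.append((n, "missing ts"))
        elif not TS_RE.fullmatch(ts):
            problems.append((n, "ISO string without offset or malformed"))
    return problems
```

    A CI job could run this over fixture logs from each service and fail the build when the returned list is non-empty.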

    Written by Unix Calculator Editorial Team — Senior Unix/Linux Engineers. Last verified May 2026.
