Rounding up syslog-ng 4 and a practical introduction to typing

syslog-ng 4 is right around the corner and the work on the topics I listed in this blog post is nearing completion. Instead of a pile of breaking changes, we chose to improve syslog-ng in an evolutionary manner: providing fine-grained compatibility with older versions along the way, so that syslog-ng 4 remains a drop-in replacement for any earlier release from the past 15 years.

What evolution in this context means in practice is that features/changes are merged into 3.x releases as we are ready with them, but they are all hidden behind a feature flag: they all come disabled by default and to enable them, one needs to use `@version: 4.0` at the top of one’s configuration file. This process is detailed in Peter Czanik’s blog post with a couple of real-world examples.

The first set of changes went into 3.36.1, released in March. Some more followed in 3.37.1 (and a related blog post), released in June, and 3.38.1 is due any day now, with most changes already accumulated on the “master” branch (nightly snapshots).

Hopefully, this is going to be the last 3.x release before 4.x is cut, but this also depends on feedback and issues that we might encounter in this cycle.

This release focused primarily on getting 4.0 ready and, as such, it concentrated pretty much on finishing up the typing feature. If you have already read the linked blog post, you might be aware that we intend to associate runtime type information with any name-value pair that we encounter, so that we can 1) allow type aware comparisons in routing decisions, 2) reproduce the original types when sending a log message to consumers. I also expect this to be an important feature as we implement more features in our long term objectives (observability, app awareness, user friendliness).

Problems that type aware comparisons attempt to solve

Probably the most important change in 3.38.1 is the introduction of type aware comparisons. Traditionally syslog-ng had two kinds of comparison operators, just like shell scripts with the “test” builtin command.

  1. one to compare strings (eq, ne, gt, lt)
  2. one to compare numbers (==, !=, >, <)

Here’s an example:

log {
  source { file("/var/log/apache2/access.log"); };
  parser { apache-accesslog-parser(); };
  if ("${.apache.response}" >= "500") {
    file("/logs/apache-errors");
  };
};

This example shows how to route requests with an HTTP response code of 500 or above to a separate file. If you look at the if {} statement, you can see that we compared the HTTP response code to 500 and tried to capture anything that is 500 or higher. But there’s a potential trap here. ${.apache.response} is a string that contains a number. “500” in syslog-ng 3.x is also a string.

How this comparison is performed depends on the operator that we use: numeric (<, >, ==, !=) or string (lt, gt, eq, ne) focused.

If you look at the example again, you can see that we used the “numeric” operators, which means that the configuration above performs the comparison correctly: it converts both “${.apache.response}” and “500” to numbers and compares them numerically.

But then, let’s see this example:

log {
  source { file("/var/log/apache2/access.log"); };
  parser { apache-accesslog-parser(); };
  if ("${.apache.httpversion}" == "1.0") {
    file("/logs/http-10-logs");
  };
};

Again, we are trying to compare a name-value pair against a literal string, this time checking for equality. Although version numbers are not strictly numeric in general, they are in this specific case. We are also using “==” as the operator, as before.

But this example will do something that is pretty unexpected:

  • we used the numeric operators in the example, so syslog-ng would convert both “${.apache.httpversion}” and “1.0” to numbers
  • but the numeric operators only support INTEGERS, floating point numbers are not supported (actually syslog-ng uses the function atoi(3) for this conversion)
  • atoi() actually picks up the numbers before the “.” in the version number and converts that to an integer, this means “1.0” becomes 1 and “1.1” becomes 1 too!
  • so the comparison above would evaluate to TRUE for any value that starts with a 1 followed by a non-digit character.
  • this means that all HTTP versions starting from 1.0 up to 1.9 end up in our file that we designated as one to hold 1.0 only traffic.
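The effect described in the list above can be reproduced outside syslog-ng. The following is a minimal Python sketch, not syslog-ng code: the `atoi` helper is a rough emulation of C's atoi(3) written for this illustration, and the list of version strings is made up.

```python
import re

def atoi(text):
    """Rough emulation of C atoi(3): skip leading whitespace, read an
    optional sign plus leading digits, and return 0 if there are none."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0

# Hypothetical HTTP versions seen in an access log.
versions = ["1.0", "1.1", "1.9", "2.0"]

# Model of the old numeric "==": both sides go through atoi() first.
matches = [v for v in versions if atoi(v) == atoi("1.0")]
print(matches)
```

In this model every 1.x version is selected, because each of them is truncated to the integer 1 before the comparison happens.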

Not really the expected behaviour. But it becomes worse.

log {
  source { file("/var/log/apache2/access.log"); };
  parser { apache-accesslog-parser(); };
  if ("${.apache.request}" == "/wp-admin/login.php") {
    file("/logs/wordpress-logins");
  };
};

This time, we are trying to filter our data based on a string comparison, and we erroneously used the numeric operator. This is what happens:

  • Neither “${.apache.request}” nor “/wp-admin/login.php” is numeric, they don’t even have digits in front of them
  • Both values are converted to 0.
  • Zero equals to zero, so the filter expression above is always TRUE.

There are other similar cases, the ugliest one when comparing a name-value pair to an empty string with numeric operators. Results are completely unexpected.
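A small Python sketch of why the filter above always matches. It is an illustration only: the `atoi` helper is a rough stand-in for C's atoi(3), the request path is invented, and the empty-string line shows what atoi-style conversion does in this model rather than a verified syslog-ng result.

```python
import re

def atoi(text):
    """Rough emulation of C atoi(3): leading digits only, 0 if there are none."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0

request = "/index.html"  # hypothetical request path, unrelated to the login page

# Neither side starts with a digit, so both become 0 and compare as equal.
always_true = atoi(request) == atoi("/wp-admin/login.php")

# An empty string also converts to 0 under atoi-style conversion.
empty_equals_zero = atoi("") == atoi("0")

print(always_true, empty_equals_zero)
```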

Type aware comparisons come to the rescue

I saw numerous cases where someone got the operator wrong when trying to compare/match something in syslog-ng. I felt that the issue has never been a user error, rather we did a poor job of providing a user friendly syntax and thus we pushed too much responsibility on those attempting to make use of these features.

But solving this kind of design mistake is never easy. Some of our users have figured this out already. We don’t want to break their configuration, right? But we want to make this easier and more intuitive for new users and new use-cases.

The solution we implemented was to make the numeric operators (==, !=, <, >) do the right thing. Based on the types of their arguments, these operators can in most cases infer what the right comparison is. We took some inspiration from JavaScript (which operates in a similar string-heavy environment) and implemented more intuitive rules for our, previously numeric only, comparisons.

Let’s see our previous examples:

@version: 4.0
log {  
  source { file("/var/log/apache2/access.log"); };
  parser { apache-accesslog-parser(); };
  if ("${.apache.response}" >= 500) {
    file("/logs/apache-errors"); 
  };
};

If you compare this to my previous example, I have removed the quotes from around “500”. In syslog-ng 3.x, the quotes were mandatory. In 4.x, they are not. If you are not using quotes, the literal 500 becomes a numeric literal, and comparing a string to a number compares the two numerically (just like JavaScript). We could even improve the Apache parser to make ${.apache.response} a number as it parses its input, but to do the right thing, it is enough that one side of the comparison is numeric.

Next example:

@version: 4.0
log {
  source { file("/var/log/apache2/access.log"); };
  parser { apache-accesslog-parser(); };
  if ("${.apache.httpversion}" == "1.0") {
    file("/logs/http-10-logs");
  };
};

I haven’t changed anything in this example: both “${.apache.httpversion}” and “1.0” are strings and they are compared as strings. So this time, only HTTP/1.0 would be routed to our logfile. 1.1 or even 1.9 would be filtered out, as expected. We could use floating point based comparisons if we wanted to by removing the quotes (just like in the previous example) or by using explicit type-casting:

if ("${.apache.httpversion}" == 1.0)

OR
if (double("${.apache.httpversion}") == "1.0")

Type casting can be applied anywhere where we used template strings before to apply a type to the result of the template expansion.

And here’s the third example:

@version: 4.0
log {
  source { file("/var/log/apache2/access.log"); };
  parser { apache-accesslog-parser(); };
  if ("${.apache.request}" == "/wp-admin/login.php") { 
    file("/logs/wordpress-logins"); 
  }; 
};

Again, no changes necessary. Both sides are strings, so we are comparing as strings. No need to use the “eq” operator. Just one set of operators and the occasional explicit type-cast will cover all use-cases. For compatibility reasons, the old “string” operators (eq, ne, lt, gt) remain available, but I hope we can forget those eventually.
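The rule used in the three examples can be summarized in a few lines of Python. This is a simplified model written for this post's examples, not syslog-ng's implementation: the `type_aware` helper is hypothetical, and treating a non-numeric string as "no match" in a numeric comparison is an assumption borrowed from JavaScript's NaN behaviour.

```python
import operator

def type_aware(op, left, right):
    """Simplified model: compare numerically if either side is a number,
    otherwise compare as strings."""
    if isinstance(left, (int, float)) or isinstance(right, (int, float)):
        try:
            return op(float(left), float(right))
        except ValueError:
            # Assumption: a string that is not a number never matches.
            return False
    return op(left, right)

print(type_aware(operator.ge, "503", 500))      # string vs. number: numeric
print(type_aware(operator.eq, "1.1", "1.0"))    # string vs. string: textual
print(type_aware(operator.eq, "/index.html", "/wp-admin/login.php"))
```

With this rule the response code check is numeric because 500 is a number, while the version and request checks stay textual because both sides are strings.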

Other typing related changes

This section briefly lists the various components that we needed to adapt to typing. These changes have landed since 3.36.1 was released but were not explicitly announced in those versions. Let me know if you are interested in any of these topics in more detail, there are probably a couple of blog posts worth of content here:

  • type aware comparisons in filter expressions: as detailed above, the previously numeric operators become type aware and the exact comparison performed will be based on types associated with the values that we compare.
  • json-parser() and $(format-json): JSON support is massively improved with the introduction of types. For one: type information is retained across input parsing->transformation->output formatting. JSON lists (arrays) are now supported and are converted to syslog-ng lists so they can be manipulated using the $(list-*) template functions. There are other important improvements in how we support JSON.
  • set(), groupset(): in any case where we allow the use of templates, support for type-casting was added and the type information is properly promoted.
  • db-parser() type support: db-parser() gets support for type casts, <value> assignments within db-parser() rules can associate types with values using the type-casting syntax, e.g. <value name="foobar">int32($PID)</value>. The “int32” is a type-cast that associates $foobar with an integer type. db-parser()’s internal parsers (e.g. @NUMBER@) will also associate type information with a name-value pair automatically.
  • add-contextual-data() type support: any new name-value pair that is populated using add-contextual-data() will propagate type information, similarly to db-parser().
  • map-value-pairs() type support: propagate type information
  • SQL type support: the sql() driver gained support for types, so that columns with specific types will be stored as those types.
  • template type support: templates can now be cast explicitly to a specific type, but they also propagate type information from macros/template functions and values in the template string
  • value-pairs type support: value-pairs form the backbone of specifying a set of name-value pairs and associated transformations to generate JSON or a key-value pair format. It also gained support for types, the existing type-hinting feature that was already part of value-pairs was adapted and expanded to other parts of syslog-ng.
  • on-disk serialized formats (e.g. disk buffer/logstore): we remain compatible with messages serialized with an earlier version of syslog-ng, and the format we chose remains compatible for “downgrades” as well. E.g. even if a new version of syslog-ng serialized a message, the old syslog-ng and associated tools will be able to read it (sans type information, of course).
syslog-ng 3.37 released

syslog-ng 3.37 has just been released, with packages becoming available on various platforms this week. You can get the detailed release notes on the GitHub releases page, however I felt this would be a good opportunity to revisit my draft on the syslog-ng long term objectives and how this release builds in that direction.

The Edge: deployment and CI/CD

Being better at the edge means that we need to improve support for use-cases where syslog-ng is directly deployed on the node/server or is deployed close to such nodes or servers. One way to deploy syslog-ng is to use a .deb or .rpm package, but more and more syslog-ng is used in a container. Our production docker image is built based on Debian. Creating this image has been a partially manual process with all the issues that this entails.

With the merge of PR #4014 and #4003, Attila Szakács automated the entire workflow in a beautiful set of GitHub Action scripts, so that:

  • Official source and binary packages (for CentOS, Debian, Fedora and Ubuntu) are built automatically, once a syslog-ng release is tagged
  • The production docker image is built and pushed automatically, once the required binary packages are successfully built.

While we have pretty good, automated unit and functional tests, we did not test the installation packages themselves. Until now. András Mitzky implemented smoke tests for the packages themselves, doing an install & upgrade and a start-stop test.

The Edge: Kubernetes

Increasingly, the edge runs on an orchestrated, container based infrastructure, such as Kubernetes. Using syslog-ng in these systems was possible but required manual integration. With the merge of PR #4015, this is becoming more of an out-of-the-box experience. Expect another blog post on this in the coming days.

Application awareness

syslog is used as an infrastructure for logging serving a wide variety of applications. For these applications, logging is not a primary concern, unfortunately. The consequence is that they often produce invalid or incorrect data. To handle these applications well, we need to cater for these issues.

For instance, certain Aruba products use a timestamp like this:

2022-03-10 08:04:08,449

Looking at this, the problem might not even be apparent: it uses a comma to separate seconds from the fractions part.

You might argue that this is not an important problem at all, who needs fractions anyway?

There are two issues with this:

  1. Fractions might be important to some (e.g. for ordering with thousands of logs per second).
  2. It breaks the parsing of the message itself (as the timestamp is embedded in a larger message), causing message related metadata to be incorrectly extracted (e.g. which device you want to attribute this message to). This means that your dashboard in a SIEM may miss vital information.
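To make the failure mode concrete, here is a small Python sketch, not syslog-ng code, of a parser that expects a dot before the fraction and only copes with the comma variant once it knows about it. The `parse_timestamp` helper and its list of formats are made up for this illustration.

```python
from datetime import datetime

raw = "2022-03-10 08:04:08,449"

def parse_timestamp(text):
    """Try the usual dot-separated fraction first, then the comma variant."""
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S,%f"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format does not match, try the next one
    return None

parsed = parse_timestamp(raw)
print(parsed.isoformat())
```

The first format is rejected because of the comma, so without the second, application specific format the timestamp would not be parsed at all.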

And this is not the only case like this. See this pull request for a similar example.

This is exactly why application awareness is important: fixing these cases means that your log data becomes more usable as a whole.

Usually it is not the programming of the solution that is difficult here, rather the difficulty lies in having to learn that the problem exists in the first place. If you have a similar parsing problem, please let us know by opening a GitHub issue. The past few such problems were submitted to us by the Splunk Connect for Syslog team, thanks for their efforts. Btw, sc4s is great if you want to feed syslog to Splunk and it uses syslog-ng internally.

On a similar note, we have improved the cisco-parser() that extracts fields from Cisco gear and added a parser for MariaDB audit logs. Both of these parsers are part of our app-parser() framework.

Others

There are a few other features I find interesting, here is a short summary:

  • Type support is nearing completion. We added support for types in template expressions, groupset() & map-value-pairs().
  • We improved syslog-ng’s own trace messages: we added the unique message ID (e.g. $RCPTID) as a tag in all message related trace messages, so that you can correlate trace messages to a specific message. We also included type information as a part of the type support effort.
  • We improved handling of list/array like data in this pull request.
  • We extended our set of TLS options by adding support for sigalgs & client-sigalgs.

 

syslog-ng on the long term: a draft on strategic directions

I made a promise some posts ago that I would use this blog both for collecting feedback and to provide information about potential next steps ahead of syslog-ng. In the same post, I also promised that you, the syslog-ng community, would have a chance to steer these directions. Please read on to find out how to do that.

In the past few weeks I performed a round of discussions/interviews with syslog-ng users. I also spent time looking at other products and analyst reports on the market. Based on all this information I’ve come up with a list of potential strategic directions for syslog-ng to tackle. Focusing on these and prioritizing features that fall into one of these directions ensures that syslog-ng indeed moves ahead.

When I performed similar goal setting exercises in my previous CTO role at Balabit, our team did something similar:

  1. brainstorming on potential directions,
  2. drafting up a cleaned up conclusion document,
  3. validating that the document is a good summary of the discussion and
  4. validating via customers that it is indeed a good summary of what the customers need.

syslog-ng is an Open Source project, so I wanted to involve the community somehow. Organizing a brainstorming session sounds difficult on-line (do you know good solutions for this?). So I wanted to create an opportunity to talk with the broad community about my thoughts somehow, in a way that leads to a useful conclusion. This is the primary intent behind this post.

Once you read the directions below, please think about if you agree with my choice of directions here! Are these indeed the most important things? Have I missed something? Do you have something in mind that should be integrated somehow? Which of the directions do you consider the most important?

Please give your feedback via this form https://forms.gle/xJ2heSHeVb7ZHUHH9, write a comment on the blog or drop me an email. Thanks.

1. The Edge

syslog-ng has traditionally been used as a tool for log aggregation, e.g. working on the server side. That’s why its CPU and memory usage has always been in focus. Being able to consume a million (sometimes millions!) of messages a second is important for server use-cases, however I think that in exchange for this focus, syslog-ng has neglected the other side of the spectrum: the Edge.

The Edge is where log messages are produced by infrastructure and applications and then sent away to a centralized logging system.

syslog-ng tackles the original “syslogd-like” deployment scenarios on the Edge, but lacks features/documentation that make it easy to deploy it in a more modern setting, e.g. as a part of a Kubernetes cluster or as a part of a cloud-native application.

Apart from the deployment questions, I consider The Edge to be also important for improving data quality and thus improving the usefulness of collected log data. I see that in a lot of cases today, log data is collected without associated meta-information. And without that meta information it becomes very difficult to understand the originating context of said log data, limiting the ability to extract insights and understanding from logs.

These are the kind of features that fall into this bucket, in no particular order:

  • Transport that is transparently carrying metadata as well as log data, plus multi-line messages (this is probably achieved by EWMM already)
  • Kubernetes (container logs, pod related meta information, official image)
  • Document GCP/AWS/Azure deployments, log data enrichment
  • non-Linux support (Windows and other UNIXes)
  • Fetch logs from Software as a Service products
  • etc

2. Cloud Native

The cloud is not just a means to deploy our existing applications to a rented infrastructure. It is a set of engineering practices that make developing applications faster and more reliable. Applications are deployed as a set of microservices, each running in its own container, potentially distributed across a cluster of compute nodes. Components of the applications are managed via some kind of container orchestration system, such as Kubernetes.

Being friendly to these new environments is important, as new applications are increasingly using this paradigm.

Features in this category:

  • Container images for production
    • as a logging side-car to collect app logs and transfer them to the centralized logging function or
    • as an application specific, local logging repository (e.g. app specific server)
  • HTTP ingestion API
    • these apps tend to communicate using HTTP, so it is more native to use that even for log ingestion
    • maybe provide compatibility with other aggregation solutions (Elastic, Splunk, etc)
  • Object Storage support
  • Stateless & persistent queueing (kafka?)
  • etc

3. Observability

The term observability has its roots in control theory, however it is increasingly applied to the operations of IT systems. Being observable in this context means that the IT system provides an in-depth view into its inner behaviours, making it simpler to troubleshoot problems or increase performance. Observability today often implies three distinct types of data: metrics, traces and logs.

I originally met this term in relation to Prometheus, an Open Source package that collects and organizes application specific metrics in a manner that easily adapts to cloud native, elastic workloads. Traditional monitoring tools (such as Zabbix or Nagios) require a top-down, manual configuration, while Prometheus reversed this concept and pushed this responsibility to application authors. Applications should expose their important metrics so that application monitoring works “out-of-the-box”. This idea quickly gained momentum, as manually configuring monitoring tools to adapt to automatically scaled application components is pretty much impossible.

Albeit observability originally comes from the application monitoring space, its basic ideas can be extended to cover traces and logs as well.

Features in this category:

  • Being observable: provide a prometheus exporter so that we can become observable out-of-the-box
  • Interoperate with Observability platforms
    • Loki destination
    • Support for OpenTelemetry (source and destination)
    • convert logs from metrics/traces and vice-versa

4. Application awareness

syslog has been a great invention: it has served us in the last 40-45 years and its importance continues into the future. Operating systems, network devices, IoT, applications, containers, container orchestration systems can all push their log data to syslog. For some of those, using syslog is the only option.

In a way syslog is the common denominator of all log producing IT systems out there and as such it has become the shared infrastructure to carry logs in a lot of environments.

In my opinion, the success of syslog stems from the simplicity of using it: just send a datagram to port 514 and you are done. However this simplicity is also its biggest limitation: it is under-specified. There have been attempts at standardization (RFC3164 and RFC5424) but these serve more as “conventions” than standards.

The consequence is that incompatible message formats limit the usefulness of log data, once collected in a central repository. I regularly see issues such as:

  • unparseable and partial timestamps
  • missing or incorrect timezone information
  • missing information about the application’s name (e.g. $PROGRAM) or hostname
  • incorrectly framed multi-line messages
  • key=value data that is in a format downstream systems are unable to parse

Sometimes it’s not the individual log entry that is the problem, rather the overly verbose logging format that becomes difficult to work with once you start using it for dashboards/queries:

  • The Linux audit system produces very verbose, multi-line logs about a single OS operation
  • Mail systems emit multiple log entries for a single email transaction, sometimes a separate log entry for each attachment.
  • etc

syslog-ng has always been good at using various heuristics to properly extract information even from incorrectly formatted syslog messages, however there are extreme cases where applications omit crucial information or use a syntax so far away from the spec that even syslog-ng is unable to parse the data correctly.

Application awareness in this context means the ability to fix up the syslog parsing with the knowledge of the application that produced the message. It is difficult to craft heuristics that work with all incorrect formats, however once we start with identifying the application, we can correctly determine what the log message was intended to look like. Fixing these issues before the message hits a consumer (e.g. SIEM) helps a lot in actually using the data we store.

Being application aware also implies that log routing decisions can become policy aware. “Forward me all the security logs” is a common request from any security department. However, actually doing this is not simple: what should count as “security”? Being application aware means that it becomes possible to classify based on applications instead of individual log messages.

Features in this category:

  • classifying incoming logs per application (e.g. app-parser() and its associated application adapters)
  • fix incoming logs and make them formatted in a way that becomes easier to handle by downstream consumers (timestamps, multi-line messages, etc.)
  • translate incoming logs into a format that a downstream system best understands

5. User friendliness

syslog-ng is a domain specific language for log management. Its performance is a crucial characteristic, but the complexity of operations performed by syslog-ng, still within the log management layer, has grown tremendously. Making syslog-ng easier to understand, and errors and problems easier to diagnose, is important in order to deal with this complexity. Having first class documentation is also important for it to succeed in any of the directions described above.

So albeit not functionality by itself, I consider User friendliness a top-priority for syslog-ng.

Features in this category:

  • syntax improvements can go a long way toward getting a feature adopted. syslog-ng has always been able to do conditional routing of log messages, however if()/elif()/else went a long way in getting it adopted. There are other potential improvements in the syntax that could make reading/writing syslog-ng configurations easier.
  • configuration diagnostics: better location reporting in error messages, warnings, etc.
  • interactive debuggability: as syslog-ng is applied to more complex problems, the related configuration becomes more complex too. Today, you have to launch syslog-ng in foreground, inject a message and try to follow its operations using the builtin trace messages. Interactive debugging would go a long way in making it easier to write and test such configurations.

Those are roughly the directions I have in mind for the future of syslog-ng. If you disagree or have some comments, please provide feedback via the form at: https://forms.gle/xJ2heSHeVb7ZHUHH9

syslog-ng 4 theme: typing

As explained in my previous post, we do have some features already in mind for syslog-ng 4, even though the work on creating a long term set of objectives for the syslog-ng project is not finished yet. One of the themes that I already have working code for is typing.

syslog-ng traditionally assumes that log data, even if it comes in a structured form (like RFC5424 structured data or JSON) is primarily textual in nature. For this reason, name-value pairs in syslog-ng are text values just as the log message as a whole. The need for typing however came up previously, most notably in cases where we sent data to a consumer that supported typing, such as:

  • Elastic, like other similar consumers, uses JSON, and attributes can have non-text types
  • SQL columns have types
  • Riemann metrics can have types

Also, it happens that typing has an impact in log routing decisions. In a lot of cases, textual comparisons or regexp matches are fine, however sometimes your routing condition depends on a value being larger than or less than a numeric value. For example:

log {
   if ("${.apache.bytes}" > "10000") {
      # do something
   };
};

In this case, doing the comparison as text is clearly incorrect: if ${.apache.bytes} was “5”, the condition above would pass, as the string “5” is larger than “10000”, which is clearly not the case if we were to compare these numerically. To allow both numeric and textual comparisons, syslog-ng has two sets of operators: the usual “<”, “==” and “>” do numeric comparisons, while “lt”, “eq” and “gt” do string comparisons. But it’s pretty easy to mix those up, even I make that mistake sometimes.
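The difference between the two kinds of comparison is easy to check in any language that has both. A minimal Python sketch, using the values from the paragraph above (the variable names are made up for the illustration):

```python
bytes_sent = "5"      # value of ${.apache.bytes} as text
threshold = "10000"

# Textual comparison goes character by character: "5" sorts after "1".
as_text = bytes_sent > threshold

# Numeric comparison converts both sides first: 5 is smaller than 10000.
as_numbers = int(bytes_sent) > int(threshold)

print(as_text, as_numbers)
```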

To address both problems, type support is being added to syslog-ng. The change by itself is pretty simple:

  • we add a “type” value associated with each name-value pair of the log message,
  • the value itself continues to be stored internally in its current, text based format,
  • whenever we need type information in a type aware context (e.g. when we format a JSON or send a riemann event), we would use this type information
  • whenever we just need the name-value pair as before, in textual context, we would just continue to use the existing string based value

The consequences:

  • type aware consumers (like: JSON, Elastic, Riemann, MongoDB, etc) would use type information automatically, no need for explicit type hints
  • we can implement type aware comparisons, so that syslog-ng does the right comparison, based on types (e.g. like JavaScript).

As always, this is probably easier to understand with examples.

Type aware JSON parsing/reproduction

@version: 4.0
log {
  source { tcp(port(2000) flags(no-parse)); };
  parser { json-parser(prefix('.json.')); };
  destination { file("/tmp/json.out" template("$(format-json .json.* --shift-levels 2)\n")); };
};

This configuration expects JSON payloads, one per line, on TCP port 2000. It parses the JSON and then reformats it using $(format-json). Let’s run this configuration:

$ /sbin/syslog-ng -Fedvtf /etc/syslog-ng/syslog-ng-typing-demo.conf

Let’s send a JSON payload to this syslog-ng instance:

$ echo '{"text": "string", "number": 5, "bool": true, "thisisnull": null, "list": [5,6,7,8]}' | nc -q0 localhost 2000

syslog-ng reports the parsing process in its debug/trace log levels:

[2022-03-03T08:40:56.408225] json-parser message processing started; input='{"text": "string", "number": 5, "bool": true, "thisisnull": null, "list": [5,6,7,8]}', prefix='.json.', marker='(null)', msg='0x7ffff00141c0'
[2022-03-03T08:40:56.408461] Setting value; name='.json.text', value='string', msg='0x7ffff00141c0'
[2022-03-03T08:40:56.408500] Setting value; name='.json.number', value='5', msg='0x7ffff00141c0'
[2022-03-03T08:40:56.408524] Setting value; name='.json.bool', value='true', msg='0x7ffff00141c0'
[2022-03-03T08:40:56.408545] Setting value; name='.json.thisisnull', value='', msg='0x7ffff00141c0'
[2022-03-03T08:40:56.408592] Setting value; name='.json.list', value='5,6,7,8', msg='0x7ffff00141c0'

Note the individual name-value pairs being set as they are extracted from the JSON format. And then this is reproduced on the output side:

{
  "thisisnull": null,
  "text": "string",
  "number": 5,
  "list": [
    "5",
    "6",
    "7",
    "8"
  ],
  "bool": true
}

Please note that “number” is numeric and “list” contains a JSON list. One limitation that is still visible here is that list elements are not typed and are always strings when being reproduced using $(format-json), because list elements are not name-value pairs.
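The round trip above can be modelled in a few lines of Python. This is a sketch of the idea only, not how syslog-ng stores messages: the `pairs` dictionary, the type tag names and the `to_json_value` helper are all hypothetical, chosen to mirror the trace output and the JSON document shown above.

```python
import json

# Hypothetical model: every name-value pair keeps its text form plus a type tag.
pairs = {
    "text": ("string", "string"),
    "number": ("5", "int"),
    "bool": ("true", "bool"),
    "thisisnull": ("", "null"),
    "list": ("5,6,7,8", "list"),
}

def to_json_value(text, type_tag):
    """Turn the stored text back into a typed value when formatting JSON."""
    if type_tag == "int":
        return int(text)
    if type_tag == "bool":
        return text == "true"
    if type_tag == "null":
        return None
    if type_tag == "list":
        # List elements carry no type of their own, so they stay strings.
        return text.split(",")
    return text

document = {name: to_json_value(text, tag) for name, (text, tag) in pairs.items()}
rendered = json.dumps(document)
print(rendered)
```

In a purely textual context the first element of each tuple would still be used as is, which is why existing configurations keep working.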

Associate type information with name-value pairs

It is not just JSON that can set types for name-value pairs, rewrite rules and db-parser() can also set them. In rewrite rules, set() can now take a type hint, and that type-hint gets associated with the value as its type:

#this makes $PID numeric
rewrite { set(int("$PID") value("PID")); };

Also, db-parser() would set type information depending on which parser we used to extract the specific field. For instance @NUMBER@ would extract an integer.

Type information returned by macros and templates and template functions

Template functions will be able to return the type, depending on the function they perform. For instance the list handling functions like $(list-slice) would return a list. Numerical functions like $(+) would return numbers. Likewise, some macros are also being annotated with their types.

Template expressions as a whole also become typed: whenever we use a “simple” template expression (e.g. with just one ‘$’ reference, like “$PID”), the type of the template is inferred automatically and that type is propagated. If the inferred type is not correct, you can always use type hints to “cast” the template expression to some other type.
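The inference rule can be sketched in a few lines of Python. This is only a model of the behaviour described above, not syslog-ng code: the VALUES table, the render() function and its type_hint parameter are hypothetical, and treating every non-simple template as a plain string is my assumption.

```python
import re

# Hypothetical name-value pairs with their type tags (illustration only).
VALUES = {"PID": ("1234", "int"), "PROGRAM": ("sshd", "string")}

# Matches one $NAME or ${NAME} reference.
REF = re.compile(r"\$(?:\{(\w+)\}|(\w+))")


def render(template, type_hint=None):
    """Return (text, type) for a template, optionally cast by a type hint."""
    # Substitute every reference with the stored string value.
    text = REF.sub(lambda m: VALUES[m.group(1) or m.group(2)][0], template)
    # A "simple" template consists of exactly one reference and nothing else.
    simple = REF.fullmatch(template)
    if type_hint is not None:
        inferred = type_hint  # an explicit "cast" always wins
    elif simple:
        inferred = VALUES[simple.group(1) or simple.group(2)][1]  # propagate
    else:
        inferred = "string"  # assumption: composite templates are plain text
    return text, inferred


print(render("$PID"))
print(render("pid=$PID"))
print(render("$PID", type_hint="string"))
```

Here "$PID" keeps the integer type of the referenced value, "pid=$PID" falls back to a string, and the type hint overrides whatever was inferred.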

When does it become available? When can I try it?

Since the typing behavior has the potential of changing the output in certain ways (e.g. produce a numeric value where there used to be a string), we are not turning this feature on automatically. As long as we are in the 3.x release train, it will stay disabled, even as parts of it are being merged. You can evaluate the feature by setting your config version (e.g. @version at the top of the config file) to 4.0, as shown with the example config above.

Then, as we release 4.0, the typing feature will be enabled by default for any configuration that uses @version: 4.0.

Most of the feature is already implemented, but not yet merged to the mainline. There are open PRs on GitHub. 3.36 is expected to contain the first batch (e.g. JSON parser pieces), but not the complete change. I expect the changes to land in mainline in 1 or 2 extra release cycles, e.g. the end of April or end of June.

Stay tuned!

syslog-ng future: the path to syslog-ng 4

syslog-ng 3.0.1 was released 17th February 2009, almost exactly 13 years ago. The key feature at that point was to add support for RFC5424, the new “syslog” protocol. The 3.0 release marked a significant conceptual change in syslog-ng as this was where we introduced support for generic “name-value pairs”, a means to encode application or organization specific fields (aka name-value pairs as we named them) associated with a log message.

The 3.x release train has been a long and a busy one. We are right now at 3.35.1 with 3.36.1 right around the corner. Not counting bugfix releases, that’s ~4 releases per year on average. This pace was slower initially (~1 release/year) which then increased due to all the engineering practices that we implemented in the last decade: syslog-ng is a very well tested application today, covered both in terms of unit tests and functional, end-to-end testing. In the last years, the syslog-ng project has produced 5-6 releases per year (every ~2 months), in a rolling model. Apart from features and bugfixes we also had a sharp focus on compatibility and avoiding regressions.

When I started to draft this post, I compiled a list of noteworthy features that were created since 3.0.1 in 2009. My intention with the list was to include it here to back up my previous claim that there are a lot of undiscovered and under-communicated aspects of syslog-ng. However, when I finished the list, I had to realise that even if I trim it down, it is still too long to discuss in a blog post in one go. For now, I’ve uploaded my raw notes here. I am probably going to use that list to publish technology pieces on the blog or create a survey to map out which are the more interesting items to syslog-ng users. I don’t know yet.

This post, however, is not about the past. The title says it all: it is about the path to syslog-ng 4. With the relaunch taking place, I was thinking what else could be better to symbolize a restart than a new major version? With that we can take a moment to reflect on the 3.x series and start anew with fresh energy.

It is very important to state that syslog-ng 4 is not the revolutionary, break-everything kind of release that we see too often in the software world. Rather it is an evolutionary change that will be produced similarly to previous releases, that is:

  • the release will contain both features and bugfixes
  • if a change in behaviour is unavoidable, we keep being compatible using the config version mechanism, e.g. the “@version:” tag in the front of the config file
  • compatibility with old config versions is retained long term (e.g. we are compatible back to 3.0, with compatibility back to 2.0 dropped just a couple of years ago)

But why the fuss, you may ask, about a new version number if nothing changes and we do exactly as before?

Well, there are some plans scheduled for 4.0 (more on those later), but I consider this release to be an opportunity to set up new, long term objectives. Objectives that will cover the upcoming releases as well and not just 4.0 itself. With the launch of this blog and through interactions with the community, I already have some thoughts of my own, still, I would like to allow community members to contribute even on the strategic level. Let’s find the mission statement for syslog-ng that covers the next 10 years and then guide the project towards those goals with a step in each release. I am posting the specifics and the mechanism of this work in an upcoming post. Until that post, please continue to send me feedback (via Email, gitter.im, GitHub, Reddit, LinkedIn whatever you like), I am truly enjoying each and every one of these interactions and make an effort to respond to all your queries. Also, the syslog-ng project started to use GitHub’s discussion feature, so if you have a suggestion with regards to syslog-ng 4, feel free to submit it here.

Release management and Support

So how would the release of 4.0 happen? Is this a new branch over 3.x? How long would we support 3.x?

These are all valid questions; however, the answer is simple: syslog-ng 4 is nothing more than a 3.x release in this respect. We will add features and bugfixes, and compatibility will be provided using the config version feature (i.e. @version). We will make no breaking changes that we cannot continue to be compatible with. There will be no separate 3.x and 4.x releases going in parallel. If we break something, fixes would be pushed out in upcoming versions (either the scheduled one or an emergency one if the problem is critical). We are confident that our current test coverage gives us a safety net that allows us to use this release strategy.

At the same time, we are scheduling some larger-scale changes that will probably not fit into the normal 8-week release cycle we do these days. We don’t want to stop doing our 3.x releases and we don’t want to publish half-baked features. So how are we going to resolve this conflict?

The regular bugfix/feature flow of 3.x will continue to operate as before. Any 4.0 related functional change will be merged to master (and thus make it into 3.x releases), but it will remain disabled by default.

Once all 4.0 related changes are merged, a 4.0.1 release will be created, effectively turning on the new behaviours, except if the user operates in `@version: 3.x` mode, which is the usual method to tell syslog-ng to operate in compatibility mode.

All of this basically means the following:

  • the 3.x feature and bugfix flow operates as normal
  • the 4.x related changes get merged and can be evaluated if someone is interested (by using “@version: 3.255” at the top of your configuration file)
  • no half-baked functionality is exposed, even if it takes longer to bake than the 8 week release cadence
  • all protected by our testing infrastructure
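The gating idea behind these points can be sketched in Python. This is a rough model under stated assumptions, not the actual syslog-ng implementation: the function names are made up, versions are simply compared as (major, minor) pairs, and 3.255 is taken as the lowest value that selects the new behaviours, as suggested in the list above.

```python
# Assumption for illustration: any config version at or above 3.255
# (including 4.0) opts in to the new behaviours.
NEW_BEHAVIOUR_FROM = (3, 255)


def parse_version_line(line):
    """Parse a line such as '@version: 3.38' into the tuple (3, 38)."""
    prefix = "@version:"
    if not line.startswith(prefix):
        raise ValueError("expected an @version line")
    major, minor = line[len(prefix):].strip().split(".")
    return int(major), int(minor)


def new_behaviour_enabled(version_line):
    """True if a config with this @version line opts in to 4.0 behaviour."""
    return parse_version_line(version_line) >= NEW_BEHAVIOUR_FROM


for line in ("@version: 3.35", "@version: 3.255", "@version: 4.0"):
    print(line, "->", new_behaviour_enabled(line))
```

The point of the design is that the same binary serves both audiences: an existing configuration keeps its old behaviour simply because its version line has not changed.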

Up until now, only the versioning framework was merged with some more queued for merging. Details on some of the plans for 4.0 are coming in separate posts. Stay tuned!

syslog-ng distribution and support bottleneck

I find that a lot of syslog-ng deployments are lagging behind and are using ancient versions. It has become difficult for me to get these deployments to more recent versions. No product is able to improve and cover new ground in a situation like this…

Being ancient is a relative term: for instance, in the JavaScript world it is considered ancient if you are using a framework that was initially released two or more years ago. New hypes and incompatible rewrites are published at a pace which makes the JavaScript ecosystem difficult to follow.

Maintaining this change velocity in the log management space is not feasible. Deploying a log management and processing infrastructure from scratch can literally take years, even when done just once. Swapping out technologies every now and then on a whim would mean that the project never reaches the goals it set out to achieve.

With that said, I still think that being able to regularly push out updates to deployments is an important bottleneck to solve for any product to be sustainable. This is needed both on the feature front (e.g. addressing new use-cases) and from the support perspective (e.g. fixing bugs).

I often get questions about syslog-ng 3.5.6. This release was originally published 5th August 2014, roughly 8 years ago, and happens to be part of EPEL7. syslog-ng is included in BMW i3 vehicles: this video shows the listing of open source components on the infotainment screen. The BMW Open Source DVD contains syslog-ng 3.4.7, a whopping fresh release from December 2013. There are similar stories with syslog-ng included in products or an OS release, usually with pretty old versions.

Why does this happen?

Due to the early adoption of syslog-ng, it was included in a number of Linux distributions and BSDs/UNIXes, and even became the default in some of them. I considered this a great success.

For none of these distributions, however, is log management a central question. They each need some kind of log daemon, but that’s it. Whether that log daemon is syslogd, rsyslog or syslog-ng does not really matter. Neither does its actual version number. So even though distributions helped initial syslog-ng adoption, they have become a bottleneck in delivering new releases to users.

Users can still upgrade, right?

Enterprise users (and products that embed Linux and syslog-ng) pick an OS version and plan with it for ~10 years. Unfortunately they deploy syslog-ng as a part of the OS and expect the OS vendor to provide support. Often, the sysadmin responsible for log management is not even allowed to upgrade. Some claim that upgrading syslog-ng would violate their support terms, causing the entire OS to become unsupported.

So even though more recent versions of syslog-ng include functionality or fixes they need, they stick to the old version and try to work around any issues they find.

The support from the OS vendor for the logging component is questionable at best and is restricted to the most basic use-cases, not cases where syslog-ng would play an important role in one’s infrastructure. Just as log management is not a central focus for the OS, neither is it for the support team behind the OS. They would fix security issues, should they be reported, but otherwise they will just continue to use what they have.

Solution: state of the art binaries to pick from

Building the latest version of syslog-ng for your enterprise distro on your own is not for the faint of heart. Even though 20 years ago, building your own kernel or application was an essential part of a sysadmin’s job on any UNIX, this is not true any more.

Also, it was a lot easier to build syslog-ng in 2001; today we have so many integrations that pulling in all the build dependencies (and the right versions) is far from trivial.

We worked hard in the past years to resolve this issue and today syslog-ng is not only available in source format. There are a number of options today to pick from, should you want to use the latest and greatest:

Over time, the building of bespoke/customized packages has become much easier too; this blog post explains it all.

So what’s your excuse? I am really interested if the options above suffice. Do you still use an old syslog-ng version? Why? Would any of the above work for you? If not, what would YOU need to upgrade syslog-ng to recent versions? And what would you need to change in your processes to plan for upgrades regularly?

If you have a response to any of these questions, please post it as a comment below or drop me an email. Thanks.