Select Page
syslog-ng 3.37 released

syslog-ng 3.37 released

syslog-ng 3.37 has just been released, packages available in various platforms this week. You can get the detailed release notes on the github releases page, however I felt this would be a good opportunity to revisit my draft on the syslog-ng long term objectives and how this release builds in that direction.

The Edge: deployment and CI/CD

Being better at the edge means that we need to improve support for use-cases where syslog-ng is directly deployed on the node/server or is deployed close to such nodes or servers. One way to deploy syslog-ng is to use a .deb or .rpm package, but more and more syslog-ng is used in a container. Our production docker image is built based on Debian. Creating this image has been a partially manual process with all the issues that this entails.

With the merge of PR #4014 and #4003, Attila Szakács automated the entire workflow in a beautiful set of GitHub Action scripts, so that:

  • Official source and binary packages (for CentOS, Debian, Fedora and Ubuntu) are built automatically, once a syslog-ng release is tagged
  • The production docker image is built and pushed automatically, once the required binary packages are successfully built.

While we have pretty good, automated unit and functional tests, we did not test the installation packages themselves. Until now. András Mitzky implemented a smoke tests for the packages themselves, doing an install & upgrade and a start-stop test.

The Edge: Kubernetes

Increasingly, the edge is often running on an orchestrated, container based infrastructure, such as Kubernetes. Using syslog-ng in these systems were possible but required manual integration. With the merger of PR #4015, this is becoming more out of the box, expect another blog post on this in the coming days.

Application awareness

syslog is used as an infrastructure for logging serving a wide variety of applications. For these applications, logging is not a primary concern, unfortunately. The consequence is that they often produce invalid or incorrect data. To handle these applications well, we need to cater for these issues.

For instance, certain Aruba products use a timestamp like this:

2022-03-10 08:04:08,449

Looking at this, the problem might not even be apparent: it uses a comma to separate seconds from the fractions part.

You might argue that this is not an important problem at all, who needs fractions anyway?

There are two issues with this:

  1. Fractions might be important to some (e.g. for ordering with thousands of logs per second).
  2. It breaks the parsing the message itself (as the timestamp is embedded in a larger message), causing message related metadata to be incorrectly extracted (e.g. which device you want to attribute this message). This means that your dashboard in a SIEM may miss vital information.

And this is not the only similar case. See for this pull request for example for a similar example.

This is exactly why application awareness is important, fixing these cases means that your log data becomes more usable as a whole.

Usually it is not the programming of the solution that is difficult here, rather the difficulty lies in having to learn that the problem exists in the first place. If you have a similar parsing problem, please let us know by opening a GitHub issue. The past few such problems were submitted to us by the Splunk Connect for Syslog team, thanks for their efforts. Btw, sc4s is great if you want to feed syslog to Splunk and it uses syslog-ng internally.

On a similar note, we have improved the cisco-parser() that extracts fields from Cisco gear and added a parser for MariaDB audit logs. Both of these parsers are part of our app-parser() framework.

Others

There are a few other features I find interesting, just a short summary

  • Type support is nearing completion. We added support for types in template expressions, groupset() & map-value-pairs().
  • We improved syslog-ng’s own trace messages: we added the unique message ID (e.g. $RCPTID) as a tag in all message related trace messages, so that you can correlate trace messages to a specific message. We also included type information as a part of the type support effort.
  • We improved handling of list/array like data in this pull request.
  • We extended our set of TLS options by adding support for sigalgs & client-sigalgs.

 

syslog-ng on the long term: a draft on strategic directions

syslog-ng on the long term: a draft on strategic directions

I made a promise some posts ago that I would use this blog both for collecting feedback and to provide information about potential next steps ahead of syslog-ng. In the same post, I also promised that you, the syslog-ng community, would have a chance to steer these directions. Please read on to find out how to do that.

In the past few weeks I performed a round of discussions/interviews with syslog-ng users. I also spent time looking at other products and analyst reports on the market. Based on all this information I’ve come up with a list of potential strategic directions for syslog-ng to tackle. Focusing on these and prioritizing features that fall into one of these directions ensures that syslog-ng indeed moves ahead.

When I performed similar goal setting exercises in my previous CTO role at Balabit, our team made something similar:

  1. brainstorming on potential directions,
  2. drafting up a cleaned up conclusion document,
  3. validating that the document is a good summary of the discussion and
  4. validating via customers that they are indeed a good summary of what the customers need.

syslog-ng is an Open Source project, so I wanted to involve the community somehow. Organizing a brainstorming session sounds difficult on-line (do you know good solutions for this?). So I wanted to create an opportunity to talk with the broad community about my thoughts somehow, in a way that leads to a useful conclusion. This is the primary intent behind this post.

Once you read the directions below, please think about if you agree with my choice of directions here! Are these indeed the most important things? Have I missed something? Do you have something in mind that should be integrated somehow? Which of the directions do you consider the most important?

Please give your feedback via this form https://forms.gle/xJ2heSHeVb7ZHUHH9, write a comment  on the blog or drop me an email. Thanks.

1. The Edge

syslog-ng has traditionally been used as a tool for log aggregation, e.g. working on the server side. That’s why its CPU and memory usage has always been in focus. Being able to consume a million (sometimes millions!) of messages a second is important for server use-cases, however I think that in exchange for this focus, syslog-ng has neglected the other side of the spectrum: the Edge.

The Edge is where log messages are produced by infrastructure and applications and then sent away to a centralized logging system.

syslog-ng trackles the original “syslogd-like” deployment scenarios on the Edge, but lacks features/documentation that make it easy to deploy it in a more modern setting, e.g. as a part of a Kubernetes cluster or as a part of a cloud-native application.

Apart from the deployment questions, I consider The Edge to be also important for improving data quality and thus improving the usefulness of collected log data. I see that in a lot of cases today, log data is collected without associated meta-information. And without that meta information it becomes very difficult to understand the originating context of said log data, limiting the ability to extract insights and understanding from logs.

These are the kind of features that fall into this bucket, in no particular order:

  • Transport that is transparently carrying metadata as well as log data, plus multi-line messages (this is probably achieved by EWMM already)
  • Kubernetes (container logs, pod related meta information, official image)
  • Document GCP/AWS/Azure deployments, log data enrichment
  • non-Linux support (Windows and other UNIXes)
  • Fetch logs from Software as a Service products
  • etc

2. Cloud Native

The cloud is not just a means to deploy our existing applications to a rented infrastructure. It is a set of engineering practices that make developing applications faster and more reliable. Applications are deployed as a set of microservices, each running in its own container, potentially distributed along a cluster of compute nodes. Components of the applications managed via some kind of container orchestration system, such as Kubernetes.

Being friendly to these new environments is important, as new applications are increasingly using this paradigm.

Features in this category:

  • Container images for production
    • as a logging side-car to collect app logs and transfer them to the centralized logging function or
    • as an application specific, local logging repository (e.g. app specific server)
  • HTTP ingestion API
    • these apps tend to communicate using HTTP, so it is more native to use that even for log ingestion
    • maybe provide compatibility with other aggregation solutions (Elastic, Splunk, etc)
  • Object Storage support
  • Stateless & persistent queueing (kafka?)
  • etc

3. Observability

The term observability roots in control theory, however it is increasingly applied to the operations of IT systems. Being observable in this context means that the IT system provides an in-depth view into its inner behaviours, making it simpler to troubleshoot problems or increase performance. Observability today often implies three distinct types of data: metrics, traces and logs.

I originally met this term in relation to Prometheus, an Open Source package that collects and organizes application specific metrics in a manner that easily adapts to cloud native, elastic workloads. Traditional monitoring tools (such as Zabbix or Nagios) require a top-down, manual configuration, while Prometheus reversed this concept and pushed this responsibility to application authors. Applications should expose their important metrics so that application monitoring works “out-of-the-box”. This idea quickly gained momentum as manually configuring monitoring tools to adapt automatically scaled application components is pretty much impossible.

Albeit observability originally comes from the application monitoring space, its basic ideas can be extended to cover traces and logs as well.

Features in this category:

  • Being observable: provide a prometheus exporter so that we can become observable out-of-the-box
  • Interoperate with Observability platforms
    • Loki destination
    • Support for OpenTelemetry (source and destination)
    • convert logs from metrics/traces and vice-versa

4. Application awareness

syslog has been a great invention: it has served us in the last 40-45 years and its importance continues into the future. Operating systems, network devices, IoT, applications, containers, container orchestration systems can all push their log data to syslog. For some of those, using syslog is the only option.

In a way syslog is the common denominator of all log producing IT systems out there and as such it has become the shared infrastructure to carry logs in a lot of environments.

In my opinion, the success of syslog stems from the simplicity of using it: just send a datagram to port 514 and you are done. However this simplicity is also its biggest limitation: it is under-specified. There have been attempts at standardization (RFC3164 and RFC5424) but these serve more as “conventions” than standards.

The consequence is that incompatible message formats limit the usefulness of log data, once collected in a central repository. I regularly see issues such as:

  • unparseable and partial timestamps
  • missing or incorrect timezone information
  • missing information about the application’s name (e.g. $PROGRAM) or hostname
  • incorrectly framed multi-line messages
  • key=value data that is in a format downstream systems are unable to parse

Sometimes it’s not the individual log entry that is the problem, rather the overly verbose logging format that becomes difficult to work with once you start using it for dashboards/queries:

  • The Linux audit system produces very verbose, multi-line logs about a single OS operation
  • Mail systems emit multiple log entries for a single email transaction, sometimes a separate log entry for each attachment.
  • etc

syslog-ng has always been good in the various heuristics to properly extract information even from incorrectly formatted syslog messages, however there are extreme cases where applications omit crucial information or use a syntax so far away from the spec that even syslog-ng is unable to parse the data correctly.

Application awareness in this context means the ability of fixing up the syslog parsing with the knowledge of the application that produced it. It is difficult to craft heuristics that work with all incorrect formats, however once we start with identifying the application, then we can correctly determine what the log message was intended to look like. Fixing these issues before the message hits a consumer (e.g. SIEM) helps a lot in actually using the data we store.

Also, being application aware also implies that log routing decisions can become policy aware. “Forward me all the security logs” is a common request from any security department. However actually doing this is not simple: what should constitute as “security”? Being application aware means that it becomes possible to classify based on applications instead of individual log messages.

Features in this category:

  • classifying incoming logs per application (e.g. app-parser() and its associated application adapters)
  • fix incoming logs and make them formatted in a way that becomes easier to handle by downstream consumers (timestamps, multi-line messages, etc.)
  • translate incoming logs into a format that a downstream system best understands

5. User friendliness

syslog-ng is a domain specific language for log management. Its performance is a crucial characteristic, but the complexity of operations performed by syslog-ng, still within the log management layer has grown tremendously. Making syslog-ng easier to understand, errors and problems easier to diagnose is important in order to deal with this complexity. Having first class documentation is also important for it to succeed in any of these directions, described above.

So albeit not functionality by itself, I consider User friendliness a top-priority for syslog-ng.

Features in this category:

  • syntax improvements can go a long way of adopting a feature. syslog-ng has always been able to do conditional routing of log messages however if()/elif()/else went a long way in getting it adopted. There are other potential improvements in the syntax that could help reading/writing syslog-ng configurations easier.
  • configuration diagnostics: better location reporting in error messages, warnings, etc.
  • interactive debuggability: as syslog-ng is applied to more complex problems, the related configuration becomes more complex too. Today, you have to launch syslog-ng in foreground, inject a message and try to follow its operations using the builtin trace messages. Interactive debugging would go a long way in making the writing and testing these functionalities.

Those are roughly the directions I have in mind for the future of syslog-ng. If you disagree or have some comments, please provide feedback via the form at: https://forms.gle/xJ2heSHeVb7ZHUHH9

syslog-ng future: the path to syslog-ng 4

syslog-ng future: the path to syslog-ng 4

syslog-ng 3.0.1 was released 17th February 2009, almost exactly 13 years ago. The key feature at that point was to add support for RFC5424, the new “syslog” protocol. The 3.0 release marked a significant conceptual change in syslog-ng as this was where we introduced support for generic “name-value pairs”, a means to encode application or organization specific fields (aka name-value pairs as we named them) associated with a log message.

The 3.x release train has been a long and a busy one. We are right now at 3.35.1 with 3.36.1 right around the corner. Not counting bugfix releases, that’s ~4 releases per year on average. This pace was slower initially (~1 release/year) which then increased due to all the engineering practices that we implemented in the last decade: syslog-ng is a very well tested application today, covered both in terms of unit tests and functional, end-to-end testing. In the last years, the syslog-ng project has produced 5-6 releases per year (every ~2 months), in a rolling model. Apart from features and bugfixes we also had a sharp focus on compatibility and avoiding regressions.

When I started to draft this post, I compiled a list of noteworthy features that were created since 3.0.1 in 2009. My intention with the list was to include it here to back up my previous claim that there are lot of undiscovered and under-communicated aspects of syslog-ng. However, when I finished with the list, I had to realise that even if I trim it down, it is still too long to discuss it in a blog post at one go. For now, I’ve uploaded my raw notes here. I am probably going to use that list to publish technology pieces on the blog or create a survey to map out which are the more interesting items to syslog-ng users. I don’t know yet.

This post however, is not about the past, the title says it all: it is about the path to syslog-ng 4. With the relaunch taking place, I was thinking what else could be better to symbolize a restart than a new major version? With that we can take a moment to reflect on the 3.x series and start anew with fresh energy.

It is very important to state that syslog-ng 4 is not the revolutionary, break-everything kind of release that we see too often in the software world. Rather it is an evolutionary change that will be produced similarly to previous releases, that is:

  • the release will contain both features and bugfixes
  • if a change in behaviour is unavoidable, we keep being compatible using the config version mechanism, e.g. the “@version:” tag in the front of the config file
  • compatibility with old config versions are retained long term (e.g. we are compatible back to 3.0, with compatibility back to 2.0 dropped just a couple of years ago)

But why the fuzz, you may ask, about a new version number if nothing changes and we do exactly as before?

Well, there are some plans scheduled for 4.0 (more on those later), but I consider this release to be an opportunity to set up new, long term objectives. Objectives that will cover the upcoming releases as well and not just 4.0 itself. With the launch of this blog and through interactions with the community, I already have some thoughts of my own, still, I would like to allow community members to contribute even on the strategic level. Let’s find the mission statement for syslog-ng that covers the next 10 years and then guide the project towards those goals with a step in each release. I am posting the specifics and the mechanism of this work in an upcoming post. Until that post, please continue to send me feedback (via Email, gitter.im, GitHub, Reddit, LinkedIn whatever you like), I am truly enjoying each and every one of these interactions and make an effort to respond to all your queries. Also, the syslog-ng project started to use GitHub’s discussion feature, so if you have a suggestion with regards to syslog-ng 4, feel free to submit it here.

Release management and Support

So how would the release of 4.0 happen? Is this a new branch over 3.x? How long would we support 3.x?

These are all valid questions, however the answer is simple: syslog-ng 4 is nothing more than a 3.x release in this respect. We will add features and bugfixes and compatibility will be provided using the config version feature (ie. @version). We will make no breaking changes that we cannot continue to be compatible with. There will be no separate 3.x and 4.x releases going in parallel. If we break something, fixes would be pushed out in upcoming versions (either the scheduled one or an emergency one if the problem is critical). We are confident that our current test coverage gives us a safety net that allows us to use this release strategy.

At the same time, we are scheduling some larger-scale changes that will probably not fit into a normal 8 week release cycle we do these days. We don’t want to stop doing our 3.x releases and we don’t want to publish half-baked features. So how are we going to resolve this conflict?

The regular bugfix/feature flow of 3.x will continue to operate as before. Any 4.0 related functional change will be merged to master (and thus make it into 3.x releases) but any functional change will be disabled.

Once all 4.0 related changes are merged, a 4.0.1 release will be created, effectively turning on the new behaviours, except if the user operates in `@config: 3.x` mode, which is the usual method  to tell syslog-ng to operate in compatibility mode.

All of this basically means the following:

  • the 3.x feature and bugfix flow operates as normal
  • the 4.x related changes get merged and can be evaluated if someone is interested (by using “@version: 3.255” at the top of your configuration file)
  • no half-baked functionality is exposed, even if they take longer to bake than the 8 week release cadence.
  • all protected by our testing infrastructure

Up until now, only the versioning framework was merged with some more queued for merging. Details on some of the plans for 4.0 are coming in separate posts. Stay tuned!

syslog-ng 3.37 released

syslog-ng distribution and support bottleneck

I find that a lot of syslog-ng deployments are lagging behind and are using ancient versions. It has become difficult for me to get these deployments to more recent versions. No product is able to improve and cover new ground in a situation like this…

Being ancient is a relative term: for instance, in the JavaScript world it is considered ancient if you are using a framework that was initially released two or more years ago. New hypes and incompatible rewrites are published at a pace which makes the JavaScript ecosystem difficult to follow.

Maintaining this change velocity in the log management space is not feasible. Deploying a log management and processing infrastructure from scratch can literally take years just one time. Swapping out technologies every now and then on a whim would mean that the project never reaches the goals it was set out to achieve.

With that I said I still think that being able to regularly push out updates to deployments is an important bottleneck to solve for any product to be sustainable. This is needed for both the feature front (e.g. addressing new use-cases) and on the support perspective (e.g. fixing bugs).

I often get questions about syslog-ng 3.5.6. This release was originally published 5th August 2014, roughly 8 years ago, and happens to be part of EPEL7. syslog-ng is included in BMW i3 vehicles, this video shows the listing of open source components on the infotainment screen, The BMW Open Source DVD contains syslog-ng 3.4.7, a whooping fresh release from December 2013. There are similar stories with syslog-ng included in products or an OS release, usually with pretty old versions.

Why does this happen?

Due to the early adoption of syslog-ng, it was included in a number of Linux distributions and BSDs/UNIXes, even became default in some of them. I considered this a great success.

For none of these distributions however is log management a central question. They each need some kind of log daemon, but that’s it. Whether that log daemon is syslogd, rsyslog or syslog-ng does not really matter. Neither matters their actual version number. So even though distributions helped initial syslog-ng adoption, they have become a bottleneck in delivering new releases to users.

Users can still upgrade, right?

Enterprise users (and products that embed Linux and syslog-ng) pick an OS version and plan with it for ~10 years. Unfortunately they deploy syslog-ng as a part of the OS and expect the OS vendor to provide support. Often, the sysadmin responsible for log management is not even allowed to upgrade. Some claim that upgrading syslog-ng would violate their support terms, causing the entire OS to become unsupported.

So even though more recent versions of syslog-ng includes functionality or fixes they need, they stick to the old version and try to work around any issues they find.

The support from the OS vendor for the logging component is questionable at best and is restricted only for the most basic use-cases, not cases where syslog-ng would play an important role in one’s infrastructure. Just as log management is not a central focus for the OS, neither is it for the support team behind the OS. They would fix security issues, should they be reported, but otherwise they will just continue to use what they have.

Solution: state of the art binaries to pick from

Building the latest version of syslog-ng for your enterprise distro on your own is not for the faint of heart. Even though 20 years ago, building your own kernel or application was an essential part of a sysadmin’s job on any UNIX, this is not true any more.

Also, it was a lot easier to build syslog-ng in 2001, today we have so many integrations that pulling all the build dependencies (and the right versions) is far from trivial.

We worked hard in the past years to resolve this issue and today syslog-ng is not only available in source format. There are a number of options today to pick from, should you want to use the latest and greatest:

Over time, the building of bespoke/customized packages has become much easier too, this blog post explains it all.

So what’s your excuse? I am really interested if the options above suffice. Do you still use an old syslog-ng version? Why? Would any of the above work for you? If not, What would YOU need to upgrade syslog-ng to recent versions? And what would you need to change your processes to plan for upgrades regularly?

If you have a response to any of these questions, please post it as comment below or drop me an email. Thanks.

syslog-ng-future.blog? Is this a fork or what?

syslog-ng-future.blog? Is this a fork or what?

I mentioned in the previous post that I would like to focus on syslog-ng and put it more into the spotlight. I also mentioned that Balabit, the company I was a founder of and the commercial sponsor behind syslog-ng, was acquired by One Identity ~4 years ago. How does this add up? Who owns the Intellectual Property (IP) for syslog-ng? Who am I or this blog affiliated with?

I felt this post was important to set things straight and make it easier to understand my motivation. If you are not much into Free Software and Open Source licenses or not interested too much in administrative nuisances of FLOSS projects, feel free to skip this post.

First of all, the IP in syslog-ng that Balabit owned originally, was transferred to the acquirer, One Identity. This includes:

  • copyright on the documentation and parts of the codebase
  • trademarks, website, marketing stuff

The good news in here is, that not all of the codebase is copyrighted by its commercial sponsor. Back in the Balabit days we enacted a change of the licensing regime in 2010 as described here: bazsi.blogspot.com/2010/07/syslog-ng-contributions-redefined.html.

With this change we’ve stopped requiring signed CLAs (Contributor’s License Agreements) whenever someone contributed to syslog-ng. This means that the copyright of any outside contributions would be retained by the contributor and not assigned to Balabit or its successors.

Over the years, many such outside contributions were merged into the syslog-ng codebase, meaning that the code today is owned by many different individuals and companies. Copyrights of files are tracked by the tests/copyright/policy file in the source tree: you can note that there are some files that are external contributions in their entirety. Some files have mixed copyrights, partly owned by One Identity, partly by the outside contributor.

Since the license syslog-ng uses is a combination of GPL + LGPL, the combined work as such is free software forever. The GPL/LGPL warrants that anyone can get the source code and be able to change it in any way he or she wishes. The sole requirement of the license is that should you distribute syslog-ng to any 3rd parties, you would need to disclose that you were using GPLed code and offer the source code along with any changes you have made.

The same rules apply to the commercial sponsor as well: as long as it relies on the work that was produced by external contributors over the years, they will need to publish any changes to syslog-ng they create. The only way out would be to rip out all code created by external contributions.

This means that syslog-ng today is a truly open source project, at least from the licensing perspective

But there’s another perspective, namely whether there is an active developer community that adds new features and publishes new releases.

Fortunately, this also holds true. The syslog-ng project lives on https://github.com/syslog-ng/syslog-ng, contributions are transparently managed in GitHub pull requests, either if created by One Identity or someone else in the community. Regular releases are produced on a 8-10 weeks cadence and are published both in source + binary formats and a docker image.

With all of the explanations above, the status of this site and myself can easily be explained: I am not affiliated with One Identity (the successor of Balabit) in any way. I am an individual who contributes time and energy to the open source syslog-ng project, just as I did the same in the last ~24 years.

This is a personal blog, related to syslog-ng and producing useful fixes and features for syslog-ng as an open source project. I am not sponsored or endorsed by One Identity. I am here to help finding out where syslog-ng should go next.

The consensus of the FLOSS community is that a project is only considered truly open source if the developer base/IP is not concentrated within a single company/organization. The reason for this understanding is simple: if there’s only one such entity then that entity has too much control/power over the project. If the company goes bankrupt or changes hands, then priorities might change in a way that causes the open source project to suffer. The long term sustainability of an open source project hinges on the breadth of its contributor base: the broader it is, the more likely it is that the open source project can outlive its creators or commercial sponsors.

The dependence of the syslog-ng project on Balabit as its commercial sponsor has been an issue since ~2009 and probably the reason why it has not become the default logging daemon in Fedora, thus RedHat Enterprise Linux. I don’t think the inclusion in Linux distributions is the prime venue of competition today as it once was. But this decision by Fedora still hurts. 🙂

In a way, the acquisition of Balabit, and my departure from One Identity later allows syslog-ng to become a truly independent-, sustainable open source project.

Stay tuned!