LogGuideWiki

/var/log is a strange and largely unexplored place full of strange creatures with cryptic names like dmesg, kern.log or syslog. Strangers in this place are sometimes greeted with utterances like RIP: 0010:nl80211_send_chandef+0x142/0x160 [cfg80211]. Should they be worried or is this just kern.log's way of saying ‘hello’?

LogGuideWiki is a friendly guide to the strange lands of /var/log that helps the traveler make sense of what is going on. It draws its knowledge from a wiki-style guidebook to the languages spoken in these parts.

Supervisors

Artefact(s)

The aim of the project is to develop the component parts of LogGuideWiki:

  • a wiki-style repository of information about log file entries, their constituent parts and meanings. This could be an adaptation of an existing Wiki system, preferably one that supports version control with Git.
  • a tool that interprets the log file entries found on a (Linux) system and displays information to the user that allows them to interpret the entry and seek more information. This could be a simple command-line tool, a terminal-based tool (ncurses, curses, lanterna?) or something with a web interface.

The basic system might use a set of regular expressions to match log file entries and should allow the results of the matching to be used in the user output. It should collect statistics about the frequency of particular entries, allow users to look at similar entries, etc.
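As a rough illustration of this basic system, the sketch below matches entries against a small rule set, feeds the captured values into the user-facing explanation, and counts how often each rule fires. Everything in it is an assumption made for illustration: the rule names, the explanation templates and the functions `interpret` and `summarise` are hypothetical, and the two patterns are simplified versions of commonly seen messages rather than entries from an actual guide.

```python
import re
from collections import Counter

# Hypothetical rule set: (rule name, pattern with named groups, explanation template).
# In the real system these would come from the wiki repository.
RULES = [
    (
        "ssh-failed-password",
        re.compile(r"Failed password for (?P<user>\S+) from (?P<ip>\S+)"),
        "Failed SSH login for user {user} from {ip}",
    ),
    (
        "oom-kill",
        re.compile(r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
        "The kernel killed process {name} (pid {pid}) to free memory",
    ),
]


def interpret(line):
    """Return (rule name, explanation) for the first matching rule, else None."""
    for name, pattern, template in RULES:
        match = pattern.search(line)
        if match:
            # The named capture groups fill the placeholders in the template.
            return name, template.format(**match.groupdict())
    return None


def summarise(lines):
    """Count how often each rule matched; unmatched lines are counted too."""
    counts = Counter()
    for line in lines:
        result = interpret(line)
        counts[result[0] if result else "unmatched"] += 1
    return counts
```

Calling `summarise` on the lines of a log file gives a frequency table that a front end could display, and the "unmatched" count hints at how much of the log the guide does not yet cover.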

It might also be useful to include a machine learning component that identifies new types of entries that might be important because they indicate a security problem or another problem with the system. Also, trying every regular expression against every log entry would be very inefficient, so it might be a good idea to develop heuristics to prioritise which expressions to try, e.g., based on prefixes that are easily matched.
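One possible shape for the prefix heuristic is sketched below, under stated assumptions: each rule declares a literal prefix, a cheap `str.startswith` check selects the candidate rules, and within a bucket the rules that matched most often are tried first. The class `RuleIndex`, its `regex_attempts` counter and the rule names in the usage note are hypothetical, and the sketch assumes that the timestamp and host part of each line has already been stripped from the message.

```python
import re
from collections import defaultdict


class RuleIndex:
    """Buckets rules by a literal prefix so that most regexes are never tried."""

    def __init__(self):
        self._buckets = defaultdict(list)  # literal prefix -> list of rule dicts
        self.regex_attempts = 0            # how many regex matches were attempted

    def add(self, name, prefix, pattern):
        rule = {"name": name, "regex": re.compile(pattern), "hits": 0}
        self._buckets[prefix].append(rule)

    def match(self, message):
        """Return (rule name, captured groups) or None if no rule matches."""
        for prefix, rules in self._buckets.items():
            # Cheap literal check first; skip the whole bucket if it fails.
            if not message.startswith(prefix):
                continue
            for rule in rules:
                self.regex_attempts += 1
                found = rule["regex"].match(message)
                if found:
                    rule["hits"] += 1
                    # Keep frequently matching rules at the front of the bucket.
                    rules.sort(key=lambda r: r["hits"], reverse=True)
                    return rule["name"], found.groupdict()
        return None
```

A rule would be registered with, for example, `index.add("usb-disconnect", "usb ", r"usb (?P<port>\S+): USB disconnect")`. The linear scan over prefixes is only a placeholder; a trie or a dictionary keyed on the first token would scale better to a large guide.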

The system can be written in any language, but I would prefer one of Python, Java, JavaScript or Julia (I am keen to learn Julia, and it should be possible to compile it to a binary using PackageCompiler).

 

Background

There are a lot of blog posts and web pages that tell you how to access the logs and what types of log files exist. An example is this one from Ubuntu. However, there does not seem to be much on what the contents of the log files mean, so interpreting any one of them seems to be mainly a quest to find a useful entry somewhere using one's favourite search engine. It might be necessary to study the Linux kernel sources and to read widely on StackOverflow to build some initial entries for the guide. However, this is not the main aim of the project, which is about building the tools that allow the community to create a guide and the tool for using that guide on a system.

One thing that would be of use is a good command of regular expressions. Log entries often span multiple lines, so the regular expressions will likely be more complicated than what you might have used before. To use some of the content of a log file entry in the output, capture groups will also be important.
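To give a feel for multi-line matching with capture groups, here is a small sketch that pulls apart the RIP line quoted at the top of this page. It assumes that the entry sits between the "cut here" and "end trace" marker lines that the kernel prints around warnings, and that any timestamp prefix has already been removed from each line. The sample text is hand-written for illustration and was not copied from a real system; the names `BLOCK`, `RIP` and `describe_warnings` are hypothetical.

```python
import re

# Hand-written, simplified sample; real kern.log lines carry a timestamp prefix.
SAMPLE = """\
------------[ cut here ]------------
Call Trace:
RIP: 0010:nl80211_send_chandef+0x142/0x160 [cfg80211]
---[ end trace 0000000000000000 ]---
"""

# MULTILINE lets ^ match at every line start; DOTALL lets . cross newlines,
# so one pattern can span the whole multi-line entry.
BLOCK = re.compile(
    r"^-+\[ cut here \]-+\n(?P<body>.*?)^---\[ end trace (?P<trace_id>\w+) \]---",
    re.MULTILINE | re.DOTALL,
)

# Named groups for the parts of the RIP line; the module part is optional.
RIP = re.compile(
    r"RIP: (?P<segment>[0-9a-f]{4}):(?P<symbol>\w+)\+(?P<offset>0x[0-9a-f]+)"
    r"/(?P<size>0x[0-9a-f]+)(?: \[(?P<module>\w+)\])?"
)


def describe_warnings(text):
    """Return one dict of captured RIP fields per warning block found in text."""
    results = []
    for block in BLOCK.finditer(text):
        rip = RIP.search(block.group("body"))
        if rip:
            results.append(rip.groupdict())
    return results
```

Running `describe_warnings(SAMPLE)` yields the function name, offset and module as separate fields, which is the kind of structured result the tool could then look up in the guide.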