Logging

A journey through time

by Kurt Neufeld
www.burgundywall.com
@kneufeld
github.com/py-yyc/pylogging

What is logging?

no

closer

INFO  2016-05-22 14:13:54,131 twisted:154: Log opened.
INFO  2016-05-22 14:13:54,154 dailydb.haystack_search:15: search: 'rude goldberg', num results: 2
WARNING 2016-05-22 14:13:54,166 twisted:154: /home/httpd/redpics/virt/lib/python2.7/site-packages/django/shortcuts.py:45: django.utils.deprecation.RemovedInDjango110Warning: The context_instance argument of render_to_string is deprecated.
WARNING 2016-05-22 14:14:25,164 django.request:103: Forbidden (Referer checking failed - no Referer.): /redpics/search/
INFO  2016-05-22 14:14:44,249 dailydb.haystack_search:15: search: 'rude goldberg', num results: 2
INFO  2016-05-22 14:18:03,033 dailydb.haystack_search:15: search: 'rude goldberg', num results: 2
INFO  2016-05-22 14:18:30,989 twisted:154: Log opened.
INFO  2016-05-22 14:18:31,012 dailydb.haystack_search:15: search: 'rude goldberg', num results: 2
WARNING 2016-05-22 14:18:31,025 twisted:154: /home/httpd/redpics/virt/lib/python2.7/site-packages/django/shortcuts.py:45: django.utils.deprecation.RemovedInDjango110Warning: The context_instance argument of render_to_string is deprecated.
INFO  2016-05-22 14:18:31,353 imagekit.specs:78: in here
ERROR 2016-05-22 14:18:31,364 django.request:284: Internal Server Error: /redpics/search/
Traceback (most recent call last):
  File "/home/httpd/redpics/virt/lib/python2.7/site-packages/django/core/handlers/base.py", line 149, in get_response
    response = self.process_exception_by_middleware(e, request)
  File "/home/httpd/redpics/virt/lib/python2.7/site-packages/django/core/handlers/base.py", line 147, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
AssertionError
      

ooohh...

What is logging?

logging
(verb) a series of messages emitted from an application
message
(noun) a distinct piece of information
emitted from an application (with a timestamp)
almost always a single line of text
quick example
DEBUG 2015-03-14 06:00:01,700 pgrabber.imgparser:52: get_url_contents: http://dilbert.com/

Why log?

There are several reasons why you want to enable logging.

  1. aid in development
  2. have a record of errors
  3. application monitoring
    1. metrics that took 3ms
    2. analytics somebody registered!
    3. alerts getting low on disc space
  4. the law :(
  5. security audits :)
  6. have a history of low level information for future aid in debugging
    • this is the through time part
    • this is a gift to your future self

Dissect Log Message

DEBUG 2015-03-14 06:00:01,700 pgrabber.imgparser:52: get_url_contents: http://dilbert.com/
PieceValue
levelDEBUG
time stamp2015-03-14 06:00:01,700
locationpgrabber.imgparser:52
messageget_url_contents: http://dilbert.com/

Who are logs for?

As previously mentioned, logs are for your future self. This implies that logs are for humans. And they certainly can be, but more often than not...

Logs are for computers. Humans don't run analytics and create graphs by hand, computers do that.

More on this later

Python Logging

complicated to set up

fantastic once that's done

Logging functionality

  • can be configured from code
  • can be configured from a file
  • can rotate the log files
  • can log to a file, stdout, socket, database
  • all at the same time!
  • custom formatters, handlers, etc
  • colors!

Concepts

levels
each message an associated level, can filter on that
handler
how a message is handled (write to file, stdout, etc)
formatter
how a message is formatted (show time, level, location, etc)
logger
instance that can have any of the above
propagate
pass message to another logger

Ramifications

  • critical messages can be emailed to devops
  • debug messages can be written to fast drives
  • turn logging on/off for modules
  • different loggers can have different formatting
  • you can turn off loggers

How to setup

https://docs.python.org/2/library/logging.html

any questions?

Simplest case

don't do this


import logging

logging.info("info message") # nothing emitted
logging.warn("warning message")
# WARNING:root:warning message
          

Next simplest case

maybe do this


import logging
FORMAT = '%(levelname)-5s %(filename)s:%(lineno)d: %(message)s'
logging.basicConfig( level=logging.DEBUG, format=FORMAT )

logging.debug("debug message")
# DEBUG main.py:5: debug message
          

but don't do that

More better (high level)


# main.py
configure_logging()
          

# a.py
import logging
logger = logging.getLogger( __name__ )

logger.debug( "value is: %d", i )
          

# b.py
import logging
logger = logging.getLogger( __name__ )

try:
  raise Exeption
except Exception as e:
  logger.error( "exception: %s", e )
  raise
          

Log configuration


# via dictionary
import logging.config
from logconf import logconf
logging.config.dictConfig( logconf() )
          

# via file
import logging.config
from logconf import logconf
logging.config.fileConfig( "logging.conf" )
          

# via code
import logging.config

logger = logging.getLogger()
logger.setLevel( logging.NOTSET )
...
          

Log configuration


def logconf():

    return {
      'version': 1,
      'disable_existing_loggers': False,
      'formatters': {
          'thread_verbose': {
              'format': '%(levelname)-5s %(asctime)s %(name)s:%(lineno)d %(process)d %(thread)d: %(message)s'
          },
          'verbose': {
              'format': '%(levelname)-5s %(name)s:%(lineno)d: %(message)s'
          },
          'simple': {
              'format': '%(levelname)-5s %(name)s:%(lineno)d: %(message)s',
              'datefmt': '%H:%M:%S',
          },
      },
      ...
          

Log configuration


...
  'filters': {
      'require_debug_false': {
          '()': 'django.utils.log.RequireDebugFalse'
      }
  },
...
          

Log configuration


  ...
  'handlers': {
      'null': {
          'class':'logging.NullHandler',
      },
      'mail_admins': {
          'level': 'ERROR',
          'filters': ['require_debug_false'],
          'class': 'django.utils.log.AdminEmailHandler'
      },
      'console':{
          'level': 'DEBUG',
          'class':'logging.StreamHandler',
          'formatter': 'simple'
      },
      'logfile': {
          'level': 'DEBUG',
          'class':'logging.handlers.GroupWriteRotatingFileHandler',
          'filename': '%s.log' % PROJECT_NAME,
          'maxBytes': 500000,
          'backupCount': 2,
          'formatter': 'verbose',
      },
  },
...
          

Log configuration


...
  'loggers': {
      '': { # default/root catch all
          'handlers': ['logfile'],
          'propagate': False,
          'level': 'WARNING',
      },
      'requests': { # requests library
          'handlers':['null'],
          'propagate': False,
          'level':'WARN',
      },
      'django': {
          'propagate': True,
          'level':'INFO',
      },
      MY_PROJECT: {
          'handlers': ['console'],
          'propagate': True,
          'level': 'DEBUG',
      },
  },
} # end of dictionary

          

libraries should never
format loggers

just get a logger
and then log

Structured Logging

There is a growing movement called structured logging

If you know your log messages are going to be consumed by a computer (vs a human) then why not make it easier on the computer (and the human writing the code) by making the log message easily parseable.

Write your log messages in JSON!

(or xml if you hate yourself)

python docs

Structured Log

currently


logger.info( "%s: rendering web page took %d ms", self, render_time() )
INFO 10:10:10.10 cheetah: rendering web page took 34 ms
            

is now


logger.info( "render_time", what=self, took=render_time() )
INFO, 10:10:10.10, "render_time", { what="cheetah", took=34 }
            

which is easy to parse for both a computer and not too bad for a person

Parsing A Structured Log


            level, time, event, fields = json.loads(msg)
            

that's it , you're done

that works for every single structured log message

what you do with it is left as an exercise to the reader

Performance

any logging framework worth its salt early exits if the log level is too low


for i in xrange(1000000): pass
time: 0m0.019s # baseline for loop
            

# set log level to info
for i in xrange(1000000): logger.debug("%d",i) # log level too low
time: 0m0.061s # 0.042 seconds slower
            

# set log level to debug
for i in xrange(1000000): logger.debug("%d",i) # no longer too low
time: 0m37.931s # piped to /dev/null
            

if you need one million log messages 38 seconds is probably a small price to pay to get them

so I don't want to EVER hear that you don't log because it's too slow

EVER!!1!

3rd Party Software

sentry

  • https://getsentry.com
  • written in python
  • dozens of language bindings
  • self hosted or paid service
  • pip install sentry
  • use a virtenv, lots of deps

3rd Party Software

logstash + =

Logstash

I've only just started playing with logstash so I'm nowhere close to an expert. Check it out online and watch some youtube videos.

Having said that, here's my expert opinion.

It's freaking awesome!

Logstash parses the logs and then feeds these records to elasticsearch which can then be searched and graphed using kibana. Kibana is a search and graphing frontend to elasticsearch.

THE END

questions?

before code examples.

!-- reveal -->