Saturday, 14 April 2018

Live reload ClojureScript & JavaScript during Cordova app development

Using ClojureScript for Cordova app development is described in a previous post. This post expands on how to live reload on code changes and how to write JavaScript unit tests with QUnit. A sample code structure is given below.
example-cordova-app
├── Gruntfile.js
├── config.xml
├── hooks
├── node_modules
├── package-lock.json
├── package.json
├── platforms
│   └── browser
│       ├── browser.json
│       ├── ...
├── plugins
├── res
├── example-cordova-cljs
│   ├── out
│   ├── project.clj
│   ├── resources
│   ├── src
│   │   └── my_app
│   │       └── core.cljs
│   ├── target
│   └── test
│       └── my_app
│           └── core_test.clj
└── www
    ├── css
    │   └── style.css
    ├── img
    ├── index.html
    ├── js
    │   ├── app.js
    │   ├── libs
    │   ├── main.js
    └── test
        ├── qunit.css
        ├── qunit.js
        ├── test.html
        └── tests.js
We can use the browser platform for quick testing during development and cordova-plugin-browsersync for live reloading of the www folder.
1. Install cordova-plugin-browsersync.
cordova plugin add cordova-plugin-browsersync
2. Once the plugin is installed, we can start the watcher from terminal.
cordova run browser -- --live-reload
When the ClojureScript source changes, the compiler places the output into www, and the plugin detects the change and triggers a reload. Refresh the browser and the latest code changes are reflected. This is also useful when we mix JavaScript and ClojureScript.
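For the ClojureScript side, the build just needs to emit its compiled JavaScript into the www folder so the plugin sees the change. A minimal sketch of such a build entry, assuming lein-cljsbuild (the build id and paths here are assumptions, not from the original post):

```clojure
;; Hypothetical :cljsbuild entry in example-cordova-cljs/project.clj.
:cljsbuild {:builds [{:id "dev"
                      :source-paths ["src"]
                      ;; Compile straight into the Cordova www folder so
                      ;; cordova-plugin-browsersync picks up each recompile.
                      :compiler {:output-to "../www/js/main.js"
                                 :output-dir "out"
                                 :optimizations :whitespace}}]}
```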

Unit Testing with QUnit
We could write ClojureScript tests, but that is a different workflow. Here we place the test scripts under the www/test folder. A sample test.html is shown below.
<!doctype html>
<html>
<head>
    <link rel="stylesheet" type="text/css" href="qunit.css">
    <script type="text/javascript" src="qunit.js"></script>
    <script type="text/javascript" src="tests.js"></script>
    <title>Testsuite</title>
</head>
<body>
    <div id="qunit"></div>
    <div id="qunit-fixture"></div>
</body>
</html>
Test scripts go into tests.js.
// tests.js
if (document.loaded) {
    test();
} else {
    window.addEventListener('load', test, false);
}

function test() {
    QUnit.module("test");
    QUnit.test("Example", function (assert) {
        assert.ok(true, "ok is for boolean test");
        assert.equal(1, "1", "comparison");
    });
}
We will use grunt to run these tasks. Add both live reload and unit test tasks in Gruntfile.js.
module.exports = function(grunt) {
    grunt.initConfig({
        pkg: grunt.file.readJSON('package.json'),
        qunit: {
            files: ['www/test/**/*.html']
        },
        exec: {
            start: {
                command: 'cordova run browser -- --live-reload'
            }
        }
    });

    grunt.loadNpmTasks('grunt-contrib-qunit');
    grunt.loadNpmTasks('grunt-exec');

    grunt.registerTask('test', ['qunit']);
    grunt.registerTask('start', ['exec:start']);
};
The required dependencies in package.json follow.
{
    "name": "com.qlambda.example.app",
    // ...
    "main": "main.js",
    "scripts": {
        "start": "cordova run browser -- --live-reload",
        "test": "grunt test"
    },
    "dependencies": {
        "browser-sync": "^2.23.6",
        "cordova-browser": "^5.0.3",
        "cordova-plugin-browsersync": "^1.1.0",
        "cordova-plugin-whitelist": "^1.3.3",
        // ...
    },
    "cordova": {
        "plugins": {
            "cordova-plugin-whitelist": {},
            "cordova-plugin-browsersync": {}
        },
        "platforms": [
            "browser"
        ]
    },
    "devDependencies": {
        "grunt": "^1.0.2",
        "grunt-contrib-qunit": "^2.0.0",
        "grunt-exec": "^3.0.0"
    }
}
The start and test tasks are now available as grunt tasks, and the main ones are linked to npm scripts as well.
# grunt
grunt start  # start watcher
grunt test   # run testsuite

# npm
npm start
npm test

Sunday, 8 April 2018

macOS Server is a disappointment

macOS Server is very much a disappointment. The main reason I use it is the quick setup of Calendar, Contacts, Notes, DNS and wiki (which I stopped using), which sync with the rest of my Apple devices on the internal network without having to install, configure and fiddle with each of these services separately. With each new update of the Server app, Apple keeps removing features, which begs the question: is it going to be discontinued? Very likely, it is.


With the latest update (5.6), more services are being removed in preparation for migration to alternative services. Support article HT208312 lists the following services slated for removal in fall 2018.
DHCP, DNS, VPN, Firewall, Mail Server, Calendar, Wiki, Websites, Contacts, Net Boot/Net Install, Messages, Radius, Airport Management
Apple recommends installing those services separately, rendering the macOS Server app useless (to me).

Thursday, 5 April 2018

Simple cowsay in Clojure

A simple version of cowsay in Clojure.
(ns fortune
  "Fortune fairy."
  (:require [clojure.pprint :as pprn]
            [clojure.string :as str])
  (:import [java.util.concurrent ThreadLocalRandom]))

(def tale ["I'll walk where my own nature would be leading: It vexes me to choose another guide."
           "Every leaf speaks bliss to me, fluttering from the autumn tree."
           "I see heaven's glories shine and faith shines equal."
           "I have to remind myself to breathe -- almost to remind my heart to beat!"
           "I’ve dreamt in my life dreams that have stayed with me ever after, and changed my ideas: they’ve gone through and through me, like wine through water, and altered the colour of my mind."])

(defn gen-random [lb ub]
  (-> (ThreadLocalRandom/current)
      (.nextInt lb ub)))

(defn gen-rand-txt []
  (nth tale (gen-random 0 (count tale))))

(def cowsay-body
"        \\   ^__^
         \\  (oo)\\_______
            (__)\\       )\\/\\
                ||----w |
                ||     ||")

(defn cowsay-hr [width]
  (println (pprn/cl-format nil "+ ~v@<~d~> +" width (apply str (repeat width "-")))))

(defn cowsay-txt-fmt [msg width]
  (println (pprn/cl-format nil "| ~v@<~d~> |" width (str/trim msg))))

(defn cowsay
  ([]
    (let [txt (gen-rand-txt)]
      (cowsay txt 0 21 21 (count txt))))
  ([msg]
    (let [txt (if (empty? msg) (gen-rand-txt) msg)]
      (cowsay txt 0 21 21 (count txt))))
  ([msg width]
    (cowsay msg 0 width width (count msg)))
  ([msg start end width len]
    (when (= start 0) (cowsay-hr width))
    (cond
      (<= end len) (do
                    (cowsay-txt-fmt (subs msg start end) width)
                    (recur msg (+ start width) (+ end width) width len))
      (< start end) (do
                      (cowsay-txt-fmt (subs msg start len) width)
                      (cowsay-hr width)                  
                      (println cowsay-body)))))
The format specifiers in cl-format are very expressive. Since this is a simple version, it just left-aligns and breaks lines at character boundaries.
boot.user=> (load-file "fortune.clj")
#'fortune/cowsay

boot.user=> (require '[fortune :as fortune])
nil

boot.user=> (fortune/cowsay)
+ --------------------- +
| I see heaven's glorie |
| s shine and faith shi |
| nes equal.            |
+ --------------------- +
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
nil

boot.user=> (fortune/cowsay (first fortune/tale) 0 13 13 (count (first fortune/tale)))
+ ------------- +
| I'll walk whe |
| re my own nat |
| ure would be  |
| leading: It v |
| exes me to ch |
| oose another  |
| guide.        |
+ ------------- +
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
nil
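To unpack the cl-format directives used in cowsay-hr and cowsay-txt-fmt (a quick REPL check):

```clojure
(require '[clojure.pprint :as pprn])

;; ~v@<...~> is a justification block: v consumes the width argument,
;; and the @ modifier left-justifies (pads on the right).
;; ~d prints the argument; a non-integer argument is printed as with ~a.
(pprn/cl-format nil "| ~v@<~d~> |" 8 "hi")
;; => "| hi       |"
```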

Boot Task
We can further expose this as a boot task. Let's say we placed the file in scripts under the project root; add the below snippet to build.boot.
(def generic-pod (future (pod/make-pod (core/get-env))))

(deftask cowsay
  [m msg VAL str "Message to print"]
  (merge-env! :source-paths #{"scripts"})
  (pod/with-call-in @generic-pod (fortune/cowsay ~msg))
  identity)
This can be run as boot cowsay -m "Every leaf speaks bliss to me, fluttering from the autumn tree.".

Saturday, 31 March 2018

OpenSAML 3 Java - VU#475445 Workaround

As described in vulnerability note VU#475445, the OpenSAML 3 library (including the Java library) is vulnerable to assertions manipulated with XML comments, which the C14N canonicalization algorithm ignores. Duo found this CVE. A quick fix is below.
; ..
(:require [clojure.string :as str])
; ..

(def ^:dynamic *subject-val*)

(defn get-subject-from-node [nodes i c]
  (when (and (< i c) (not (realized? *subject-val*)))
    (let [node (.item nodes i)
          node-name (.getNodeName node)]
      (when (str/includes? node-name "Assertion")
        (get-subject-from-node (.getChildNodes node) 0 (.getLength (.getChildNodes node))))
      (when (str/includes? node-name "Subject")
        (get-subject-from-node (.getChildNodes node) 0 (.getLength (.getChildNodes node))))
      (when (str/includes? node-name "AuthenticationStatement")
        (get-subject-from-node (.getChildNodes node) 0 (.getLength (.getChildNodes node))))
      (when (str/includes? node-name "NameID")  ; SAML v2
        (deliver *subject-val* (.getTextContent node)))
      (when (str/includes? node-name "NameIdentifier")  ; SAML v1
        (deliver *subject-val* (.getTextContent node)))
      (get-subject-from-node nodes (inc i) c)))
  @*subject-val*)
This extracts the subject from a SAMLResponse in SAML v1 or v2 format.
(defn get-subject [doc-elem assertion]
  (binding [*subject-val* (promise)]
    (let [name-id (.getNameID (.getSubject assertion))
          sub (.getValue name-id)
          resp-node (.getFirstChild (.getParentNode doc-elem))
          resp-node-childs (.getChildNodes resp-node)
          sub-cve (get-subject-from-node resp-node-childs 0 (.getLength resp-node-childs))]
      (if (= sub sub-cve) sub sub-cve))))
Here we take the OpenSAML 3 unmarshalled object and obtain the subject value by calling (.getValue name-id). But since the library is vulnerable, we also extract the subject manually by calling (get-subject-from-node resp-node-childs 0 (.getLength resp-node-childs)). If the two do not match, we return sub-cve, as get-subject-from-node extracts the subject while ignoring any comment injected into the subject value.

Example
<saml:NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress">2<!-- foo -->1@example.com</saml:NameID>
The assertion's subject value can be replaced as above. Since C14N ignores comments, even if we modify the XML after signing to add comments, validation will ignore the added comments while recomputing the signature.

NB: the subject 2<!-- foo --> should not be in escaped form in the SAMLResponse.

With this, OpenSAML 3 gives sub as 1@example.com, while the workaround function gives sub-cve as 21@example.com.

Sidenote
; ...
(:import [javax.xml.parsers DocumentBuilderFactory]
         [org.xml.sax SAXException])
; ...

(def disallow-doctype-dec "http://apache.org/xml/features/disallow-doctype-decl")
(def external-parameter-entities "http://xml.org/sax/features/external-parameter-entities")
(def load-external-dtd "http://apache.org/xml/features/nonvalidating/load-external-dtd")

(defn get-doc-builder 
  "Returns a document builder, which should be called for each thread as parse is not thread safe"
  []
  (let [doc-builder-factory (DocumentBuilderFactory/newInstance)]
    (.setNamespaceAware doc-builder-factory true)
    ;; prevent XXE
    (.setFeature doc-builder-factory disallow-doctype-dec true)
    (.setFeature doc-builder-factory external-parameter-entities false)
    (.setFeature doc-builder-factory load-external-dtd false)
    (.setXIncludeAware doc-builder-factory false)
    (.setExpandEntityReferences doc-builder-factory false)
    (.newDocumentBuilder doc-builder-factory)))

(defn parse-xml
  "Parse the given xml which can be a input stream, file, URI."
  [xml]
  (try
    (.parse (get-doc-builder) xml)
    (catch SAXException excep
      ;; attach the underlying exception as the cause so it is not lost
      (throw (ex-info "XML parse exception." {:cause [:err-xml-parse]} excep)))))
Here we use the JAXP parser to parse the XML, then use OpenSAML 3 to unmarshal the parsed org.w3c.dom.Document into OpenSAML 3 objects for easy extraction and validation of the SAMLResponse.
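The unmarshalling step can be sketched roughly as below. This is a hedged sketch based on the OpenSAML 3 API; the unmarshall-response helper is an assumption, not code from the original workaround, and it reuses the parse-xml function above.

```clojure
;; Sketch, assuming OpenSAML 3 is on the classpath.
(import '[org.opensaml.core.config InitializationService]
        '[org.opensaml.core.xml.config XMLObjectProviderRegistrySupport])

(defn unmarshall-response
  "Hypothetical helper: parse the XML and unmarshal the document element
   into an OpenSAML 3 object graph."
  [xml]
  (InitializationService/initialize)  ; bootstrap OpenSAML (once at startup)
  (let [elem (.getDocumentElement (parse-xml xml))
        unmarshaller (.getUnmarshaller
                       (XMLObjectProviderRegistrySupport/getUnmarshallerFactory)
                       elem)]
    (.unmarshall unmarshaller elem)))
```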

Friday, 23 March 2018

Everything is a Monad - Formalising Data Flow Pipeline in Clojure with Category Theory

A basic notion of a data flow pipeline is described in a previous post, Data Flow Pipeline in Clojure. Here we will add some formalism to it using category theory applied in the context of Clojure. For comparison, think of Haskell: a language based on typed lambda calculus that implements concepts from category theory. Haskell defines its own category named Hask, whose objects and arrows are bound by laws. Clojure, on the other hand, is closer to untyped lambda calculus (though not in the strict sense, as the reduction model differs), and no one has defined a particular category whose s-expressions are bound by the laws of category theory; nor is that really required. Now to the land of Clojure.

1. First we need to define a category, let's say turtle (name inspired from the first programming language I learned, LOGO).

2. A category consists of a collection of entities called objects and a collection of entities called arrows and few other properties (assignments and composition).

3. There are two assignments, source (or domain) and target (or codomain), each of which attaches an object to an arrow. That is, an arrow maps an object in the source to an object in the target. This can be represented as:
                    f
  A -----------------------------------> B
source            arrow               target
domain           morphism            codomain
                   map
4. So to model this, let's use clojure.spec.
(spec/def ::variant vector?)
(spec/def ::tag keyword?)
(spec/def ::msg string?)
(spec/def ::state (spec/map-of keyword? any?))
(spec/def ::turtle-variant (spec/tuple ::tag ::msg ::state))
Here we defined the object to be a ::turtle-variant, the structure of which is [keyword? string? map?]. We need to check that the arrows take objects belonging to the turtle category; otherwise the morphism cannot belong to the category.
(defn validate-variant 
  "Check if the given element belongs to the category turtle.
   Given a data structure, check if it conforms to a variant. If it does not explain the reason for non-conformance."
  [predicate]
  (let [rule (fn [f] (f ::turtle-variant predicate))]
    (if-not (rule spec/valid?)
      (rule spec/explain)
      true)))
5. Some helper functions.
(defn ok
  "Construct an object of type success in turtle which can passed to a functor. Returns a ::turtle-variant."
  [msg result]
  [:ok msg result])

(defn err
  "Construct an object of type error in turtle which can be passed to a functor. Returns a ::turtle-variant."
  [msg ex]
  [:err msg ex])
6. Now let us create an arrow that consumes an object in the category and returns an object, where the returned object must also be in the category. Intuitively, arrows correspond to functions, i.e. morphisms. Here the object is ::turtle-variant, and since the morphism produces the same kind of object, we call it an endomorphism: a mapping of an object in a category to itself. We define a functor such that it applies some transformation but preserves the structure. Below, fetch takes a ::turtle-variant and returns a ::turtle-variant.
(defn fetch
  "A functor which performs some operation on the object."
  [varg]
  (let [[tag msg {:keys [url opts] :as state}] varg
        [status body status-code] (http-get url opts)]
    (condp = status
      :ok (ok msg (merge {:res body} state))
      :err (err "Error" {:msg msg :excep body}))))

(fetch (ok "html" {:url "http://example.com" :opts nil}))
7. The turtle category is not fully defined yet because we need to define partial composition of arrows. Identity can be defined trivially.
Arw x Arw -> Arw
Now, here is where we differ from most monad libraries out there that try to implement what Haskell does. Here partial composition does not mean that a function should return a partial. Certain pairs of arrows are compatible for composition to form another arrow. Two arrows:
      f                  g
A ---------> B1   B2 ---------> C
are composable in that order precisely when B1 and B2 are the same object, and then an arrow
A ---------> C
is formed.

8. If we look at the way we defined the functions and the type constraint on them, this law holds, because all functions take and return the same structure, the ::turtle-variant.
At this point, we have defined our category turtle.

9. But from a programmer's perspective, simply having standalone functions is not useful. We need to compose them: call a function (passing arguments if necessary), take its return value, call another function with that value, and so on.
(defn endomorph
  "Apply a morphism (transformation) for the given input (domain) returning an output (co-domain) where both domain and
   co-domain belongs to the category turtle. The functors are arrows in the same category. This qualifies as a monad
   because it operates on functors and returns an object. And associative law holds because in which way we compose the functors,
   the result object is ::turtle-variant. This is also a forgetful functor.
   
   Decide whether to continue with the computation or return. All function in the chain should accept and return a variant,
   which is a 3-tuple."
  ([fns]
    (endomorph fns (ok "" {}))) 
  ([fns init-tup]
    (reduce (fn [variant fun]
      {:pre [(validate-variant variant)]
       :post [(if (reduced? %) (validate-variant (deref %)) (validate-variant %))]}
      (let [[tag _ _] variant]
        (condp = tag
          :err (reduced variant)  ;if tag is :err short-circuit and return the variant as the result of the computation
          :ok (fun variant))))
      init-tup
      fns)))
The endomorph is a monoid. A monoid structure is defined as
(R, *, 1)
where R is a set, * is a binary operation on R and 1 is a nominated element of R, where
(r * s) * t = r * (s * t)    1 * r = r = r * 1
In our case, the set R is the set of functions that operate on functors, and we have defined only one: the endomorph function. It operates on functors (which are themselves endofunctors here). The binary operation is the reduce function, and identity is trivial; we can use Clojure's constantly and identity functions for that. The associative law holds because whichever way we order the composition, the result is the same kind of object, a ::turtle-variant. That is the theory perspective. From a programmer's perspective, we cannot do arbitrary reordering, since we need the values within the structure. So it is conceptually associative, but the order of computation matters.

10. Now this is also a monad. The classic definition of a monad follows.
All told a monad in X is just a monoid in the category of endofunctors of X, with product * replaced by composition of endofunctors and unit set by the identity endofunctor
-- Saunders Mac Lane in Categories for the Working Mathematician
From the above definition, it is clear that endomorph is a monad, as it operates on endofunctors and is itself a monoid (from above). It is also a forgetful functor, because some property is discarded once we apply the transformation.

Example usage:
;; Let's say we have many functions similar to fetch, which conform to the contract defined
;; in the :pre and :post of reduce function of endomorph, we can chain them as below
(endomorph [fetch parse extract transform save] (ok "Init Args" {:url "http://example.com" :xs ".."}))
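The associativity claim above can be stated as a sketch: nesting an endomorph call inside the chain yields the same variant. Here f, g, h stand for any conforming functors (e.g. fetch, parse, extract) and init for a valid initial variant; these names are placeholders, not definitions from the post.

```clojure
;; Conceptual associativity: grouping does not change the resulting variant,
;; because each grouping is itself a variant -> variant function.
(= (endomorph [f g h] init)
   (endomorph [f #(endomorph [g h] %)] init)
   (endomorph [#(endomorph [f g] %) h] init))
```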

After all this proof, I came to the realisation that Leibniz was right about everything being a monad.

Monday, 12 February 2018

JSON Logging with MDC using Log4j2 in Clojure

This grew out of necessity and illustrates JSON logging with MDC in Clojure. Also, it is generally understood that log4j2's async performance is better than that of other logging libraries at this point in time.

The problem statement: the application and its included libraries must output JSON-formatted logs that can be fed directly to a Logstash endpoint, so the output format should be compatible with the defined Elasticsearch format. In my case there are mandatory fields without which Elasticsearch will discard the logs. The approach is simple: use a pattern layout to log as JSON, with additional pattern converter keys defined as needed so that the necessary data objects can be marshalled into the logs.
Below is the log4j2.xml.
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="debug" xmlns="http://logging.apache.org/log4j/2.0/config" packages="com.example.logger">
  <Properties>
    <!-- get from env instead -->
    <Property name="application">appName</Property>
    <Property name="app-version">0.0.1</Property>
    <Property name="host">localhost</Property>
    <Property name="env">localhost</Property>
  </Properties>
  <Appenders>
    <Console name="console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
    </Console>
    <RollingRandomAccessFile name="plain-log" fileName="logs/app_plain.log" filePattern="app_plain.log.%i" append="false" immediateFlush="true" bufferSize="262144">
        <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg %ex%n"/>
        <Policies>
          <SizeBasedTriggeringPolicy size="1GB"/>
        </Policies>
        <DefaultRolloverStrategy fileIndex="max" min="1" max="100" compressionLevel="3"/>
    </RollingRandomAccessFile>
    <RollingRandomAccessFile name="json-log" fileName="logs/app.log" filePattern="app.log.%i" append="true" immediateFlush="true" bufferSize="262144">
        <PatternLayout pattern='{"@timestamp":"%d{ISO8601}","thread":"%t","level":"%p","logger":"%c","description":"%m %ex","correlation_id":"%mdc{correlationid}","headers_data":%hd,"endpoint":"%mdc{endpoint}","environment":${env},"application":"${application}","application_version":"${app-version}","type":"log","host":"${host}","data_version":2}%n'/>
        <Policies>
          <SizeBasedTriggeringPolicy size="1GB"/>
        </Policies>
        <DefaultRolloverStrategy fileIndex="max" min="1" max="100" compressionLevel="3"/>
    </RollingRandomAccessFile>
  </Appenders>
  <Loggers>
    <Logger name="com.example.core" level="debug" additivity="false">
      <AppenderRef ref="console" level="info"/>
      <AppenderRef ref="json-log"/>
      <AppenderRef ref="plain-log"/>
    </Logger>
    <Root level="info">
      <AppenderRef ref="console"/>
      <AppenderRef ref="json-log"/>
      <AppenderRef ref="plain-log"/>
    </Root>
  </Loggers>
</Configuration>
RollingRandomAccessFile has PatternLayout specified in JSON format with the necessary keys. Here headers_data is a key with a custom converter pattern %hd. This pattern is defined in a class HeadersDataConverter.java as follows.
package com.example.logger;

import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.pattern.ConverterKeys;
import org.apache.logging.log4j.core.pattern.LogEventPatternConverter;
import org.apache.logging.log4j.util.ReadOnlyStringMap;

import com.example.logger.bean.RequestHeaderData;

/** headers_data converter pattern */
@Plugin(name="HeadersDataConverter", category="Converter")
@ConverterKeys({"hd", "headersData"})
public class HeadersDataConverter extends LogEventPatternConverter {

    protected HeadersDataConverter(String name, String style) {
        super(name, style);
    }

    public static HeadersDataConverter newInstance(String[] options) {
        return new HeadersDataConverter("requestHeader", Thread.currentThread().getName());
    }

    private RequestHeaderData setHeaderData(LogEvent event) {
        ReadOnlyStringMap ctx = event.getContextData();
        RequestHeaderData hd = new RequestHeaderData();

        hd.setAccept(ctx.getValue("accept"));
        hd.setAcceptEncoding(ctx.getValue("accept-encoding"));
        hd.setAcceptLanguage(ctx.getValue("accept-language"));
        // ...
        hd.setxPoweredBy(ctx.getValue("x-powered-by"));
        return hd;
    }

    @Override
    public void format(LogEvent event, StringBuilder toAppendTo) {
        toAppendTo.append(setHeaderData(event));
    }
}
RequestHeaderData is a Java bean, serialized via an overridden toString() method that marshals the object to a string using ObjectMapper.
package com.example.logger.bean;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.PropertyNamingStrategy;
import com.fasterxml.jackson.databind.annotation.JsonNaming;

import java.io.Serializable;

/** headers_data bean */
@JsonIgnoreProperties(ignoreUnknown = true)
@JsonNaming(PropertyNamingStrategy.SnakeCaseStrategy.class)
public class RequestHeaderData implements Serializable {

    private static final long serialVersionUID = 3559298447657197997L;
    private final ObjectMapper mapper = new ObjectMapper();

    private String accept;
    private String acceptEncoding;
    private String acceptLanguage;
    // ...
    private String xPoweredBy;

    public RequestHeaderData() {}
    
    // Generate getters and setters. Eclipse or any other IDE can do that for us.

    @Override
    public String toString() {
        String str = "";
        try {
            str = mapper.writeValueAsString(this);
        } catch (Exception ex) {}
        return str;
    }
}
SnakeCaseStrategy is the conversion strategy used, which automatically converts camelCase property names to snake_case. Overrides can be specified using @JsonProperty("override_string_here"). That is all there is to it. Specifying packages="com.example.logger" in log4j2.xml lets log4j2 find the HeadersDataConverter plugin registered with hd as one of its pattern converter keys.

Now we have logs in the format
{"@timestamp":"2018-02-08T18:40:07,793","thread":"main","level":"INFO","logger":"com.example.web","description":"Service started. ","correlation_id":"","headers_data":{"accept":null,"accept_encoding":null,"accept_language":null,"cache_control":null,"client_ip":null,"correlationid":null,"connection":null,"content_length":null,"content_type":null,"dnt":null,"host":null,"remote_addr":null,"request_method":null,"path_info":null,"pragma":null,"query_string":null,"true_client_ip":null,"url":null,"upgrade_insecure_requests":null,"user_agent":null,"via":null,"x_forwarded_for":null,"x_forwarded_host":null,"x_forwarded_port":null,"x_forwarded_proto":null,"x_orig_host":null,"x_powered_by":null},"endpoint":"","environment":localhost,"application":"appName","application_version":"0.0.1","type":"log","host":"localhost","data_version":2}

The project.clj should contain the following dependencies.
; ...
; logging
[org.clojure/tools.logging "0.4.0"]
[org.apache.logging.log4j/log4j-core "2.9.0"]
[org.apache.logging.log4j/log4j-api "2.9.0"]
[org.apache.logging.log4j/log4j-slf4j-impl "2.9.0"]
; custom json logging
[com.fasterxml.jackson.core/jackson-core "2.9.2"]
[com.fasterxml.jackson.core/jackson-annotations "2.9.2"]
[com.fasterxml.jackson.core/jackson-databind "2.9.2"]
[org.slf4j/slf4j-api "1.7.24"]
; ....
:source-paths ["src"]
:test-paths ["test"]
:java-source-paths ["src-java"]
:javac-options ["-target" "1.8" "-source" "1.8" "-Xlint:unchecked" "-Xlint:deprecation"]
; ...
tools.logging is the Clojure library providing macros that delegate logging to the underlying logging implementation (log4j2 here). slf4j-api is a facade that works with different logging libraries, and most well-known libraries implement it, so any third-party dependency that uses a different logging library such as logback will still work. But we need the log4j-slf4j-impl bridge, which routes all logs that go through SLF4J to log4j2. And since we defined a custom pattern, it applies to all the logs. Simple it is.

The only caveat is that the custom pattern converter requires a well-defined class. If the object is not known at compile time, as when logging arbitrary JSON, it is easier to extend the Layout instead.

ThreadContext (MDC)
ThreadContext is thread-local data that can be attached to a particular thread in log4j2. SLF4J calls this MDC (Mapped Diagnostic Context). The point is: when a server gets a request handled by a thread (or handed over to subsequent threads), any logs emitted during the execution of that request should carry a unique identifier so that we can easily correlate all the logs for that particular request. Furthermore, if we have multiple services, we can correlate across them using a unique correlationId, if set. This is done by setting appropriate values in the thread-local context map.

Let's see how to do this with Aleph server in Clojure.
(ns com.example.web
  "Web Layer"
  (:require [aleph.http :as http]
            [manifold.stream :as stream]
            [compojure.core :as compojure :refer [GET POST defroutes]]
            [compojure.response :refer [Renderable]]
            [ring.middleware.params :refer [wrap-params]]
            [ring.middleware.keyword-params :refer [wrap-keyword-params]]
            [clojure.core.async :as async]
            [clojure.java.io :as io]
            [clojure.tools.logging :as log])
  (:import [org.apache.logging.log4j ThreadContext]))

(extend-protocol Renderable
  manifold.deferred.IDeferred
  (render [d _] d))

(defn say-hi [req]
  {:status 200
   :body "hi"})

(defmacro with-thread-context [ctx-coll & body]
  `(do
    (ThreadContext/putAll ~ctx-coll)  ;Set thread context
    ~@body))

(defn wrap-logging-context [handler]
  (fn [request]     
    ;; Set request map and other info in the current thread context
    (ThreadContext/putAll (merge {"endpoint" (:uri request)
                                  "remote-addr" (:remote-addr request)
                                  "query-string" (:query-string request)}
                                  (:headers request)))
    (handler request)))

(defn http-response [response options]
  (ThreadContext/clearAll)  ;Clears thread context
  response)

(defn wrap-http-response
  {:arglists '([handler] [handler options])}
  [handler & [{:as options}]]
  (fn 
    ([request]
      (http-response (handler request) options))
    ([request respond raise]
      (handler request (fn [response] (respond (http-response response options))) raise))))

(defn say-hi-handler [req]
  (let [ctx (ThreadContext/getContext)]  ;Get current thread context
    (stream/take!
      (stream/->source
        (async/go
          (let [_ (async/<! (async/timeout 1000))]
            (with-thread-context ctx
              (say-hi req))))))))

(defroutes app-routes
  (POST ["/hi/"] {} say-hi-handler))

(def app
  (-> app-routes
      (wrap-logging-context)
      (wrap-keyword-params)
      (wrap-params)
      (wrap-http-response)))

(defn -main []
  (http/start-server #'app {:port 8080})
  (log/info "Service started."))
Here we wrap the ring handler with the wrap-logging-context middleware, which sets the request map on the server thread handling the particular request. Since aleph uses async threads for each Compojure route, we need to pass the context to those threads: we capture the context ctx in say-hi-handler and use the with-thread-context macro to restore it. That's all there is to logging with thread context.

Sidenote: Getting log4j2 to read the config is a big pile of mess when building a standalone jar because of Clojure, Java interop and compilation nuances. Makes me hate everything in this universe.

Thursday, 18 January 2018

Rapid Prototyping in Clojure with boot-clj

I find boot-clj great for rapid prototyping. It can be considered analogous to GroovyConsole. We can dynamically add dependencies, write and modify code, run and experiment all within a single file, without having to create a project as with the default lein setup. Create a folder for experiments and add a boot.properties file to it.
#https://github.com/boot-clj/boot
BOOT_CLOJURE_NAME=org.clojure/clojure
BOOT_VERSION=2.7.1
BOOT_CLOJURE_VERSION=1.8.0
Then we can create our prototype files, say pilot.clj, with the below example template.
#!/usr/bin/env boot

;; To add some repository
(merge-env! :repositories [["clojars" {:url "https://clojars.org/repo/" :snapshots true}]])

(defn deps
  "Add dependencies to the namespace."
  [new-deps]
  (merge-env! :dependencies new-deps))

;; Add clojure
(deps '[[org.clojure/clojure "1.8.0"]])

;; Require
(require '[clojure.string :as str])
;; Import
(import '[javax.crypto.spec SecretKeySpec])

(println (str/upper-case "hi"))  ;; HI 
For faster startup of boot-clj, add the following to the shell profile (e.g. .zshrc). Tune according to the machine.
# boot-clj faster startup
export BOOT_JVM_OPTIONS='
  -client
  -XX:+TieredCompilation
  -XX:TieredStopAtLevel=1
  -Xmx2g
  -XX:+UseConcMarkSweepGC
  -XX:+CMSClassUnloadingEnabled
  -Xverify:none'