Black Swans

July 9, 2016

I have just started “reading” The Black Swan by Nassim Nicholas Taleb. I use the term “reading” loosely, as I’m not actually reading the book: I bought the audiobook version.

What is a Black Swan

According to the book, a black swan is an event that satisfies the following criteria.

  • It is unexpected to the observer.
  • It has a major effect after it first occurs.
  • It is rationalized in hindsight, as if it could have been expected.

The event that gives the book its name could itself be considered a black swan.

From Roman times until the late 17th century, Europeans had only ever seen white swans. By all empirical evidence at the time, all swans were white. By the 16th century, a black swan was considered impossible, and the idea entered common language: something was “as impossible as a black swan”. Then, in 1697, Dutch explorer Willem de Vlamingh became the first European to see a black swan, in Western Australia, and then brought a pair back to Europe. This changed the perception, and the saying subsequently changed meaning: something perceived as impossible might later be shown to be possible.

I haven’t finished the book yet, but it does talk about trying to avoid black swans that are bad and exploiting those that are good.

My thoughts on this: Netflix’s Chaos Monkey, which randomly kills servers, is one possible way to try to avoid negative black swans. By causing unpredictable events, learning from them and then guarding against them, Netflix’s systems become more resilient to actual black swan events such as unexpected disruptions.

On the other hand, exploiting a black swan can be seen as experimentation: trying something different, something new, and if it works, exploiting it and innovating.

The book “Adapt: Why Success Always Starts with Failure” by Tim Harford also goes down this line, where constant experimentation and trying new things increase your chances of finding something new and exciting: a positive black swan.

This can be seen in the unicorn companies such as Netflix, Facebook, Etsy, Google and Amazon. These companies continuously try something new. They deploy new functionality multiple times per day, get feedback, and then decide whether to keep the new functionality, drop it, or modify it and try again.

On Audiobooks

I am finding it a little difficult to actively read an audiobook. By actively read, I mean take notes while reading. I find it easier to do with a written book as opposed to a spoken book. I find the audiobook reading a little more passive. Hopefully enough will sink in that I will be able to produce more posts.


FizzBuzz

October 27, 2015

A little while ago, I had a discussion with one of my colleagues about documentation. His argument was that the best documentation is the code. My argument was contrary to that.

I thought I would discuss my thoughts here.

To get things started, let’s say you have the code below…

package com.beanietech.fizzbuzz;

public class FizzBuzz {

    /*
     Extra fizz for your buzz!
     */
    public static int fizzbuzz(int fizz, int buzz) {
        while (buzz != 0) {
            int awesome = (fizz & buzz);

            fizz = fizz ^ buzz;

            buzz = awesome << 1;
        }
        return fizz;
    }

}

I’ve changed the variable names to make it a little obscure, but given it’s only half a dozen lines of code, can you tell me in 10 seconds what it does?

My guess is that you can’t. I couldn’t.

I could write a word document that describes the functionality, it could say…

Given 2 numbers, the two numbers are multiplied together and the result is returned.

Ah, now you say, I know what it does!

Are you sure? Word documents can be incorrect or out of date. You cannot rely on them.

So what is next?

This is where my argument comes in. My view is that the best documentation is the tests. Not the code, and not a written description of what the code does.

So let’s write a test.

import com.beanietech.fizzbuzz.FizzBuzz;
import org.junit.After;
import org.junit.AfterClass;
import static org.junit.Assert.assertEquals;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;

/**
 *
 * @author hpaff
 */
public class TestFizzBuzz {
    
    public FizzBuzz fizzBuzz;
    
    
    public TestFizzBuzz() {
    }
    
    @BeforeClass
    public static void setUpClass() {
    }
    
    @AfterClass
    public static void tearDownClass() {
    }
    
    @Before
    public void setUp() {
        //fizzBuzz = new FizzBuzz();
    }
    
    @After
    public void tearDown() {
    }

    
    
    @Test
    public void testFizzBuzz1() {
        // 1 and 1 should give 2 (the original expected value of 4 would fail)
        assertEquals(2, FizzBuzz.fizzbuzz(1, 1));
    }

    @Test
    public void testFizzBuzz2() {
        assertEquals(4, FizzBuzz.fizzbuzz(2, 2));
    }

    @Test
    public void testFizzBuzz3() {
        assertEquals(6, FizzBuzz.fizzbuzz(3, 3));
    }

    @Test
    public void testFizzBuzz4() {
        assertEquals(8, FizzBuzz.fizzbuzz(4, 4));
    }

}

As you can see, the tests clearly show what the functionality is.

When we run it, we get confirmation.

[Image: Test]

So now we know that the method adds two numbers.
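For anyone curious why those few lines add, here is a sketch of the same loop with descriptive names and comments. The names `BitwiseAdd`, `add` and `carry` are mine and do not appear in the original code; the logic is intended to be identical.

```java
// Hypothetical, renamed version of the fizzbuzz method shown earlier.
final class BitwiseAdd {

    // Adds two ints using only bitwise operations.
    static int add(int a, int b) {
        while (b != 0) {
            int carry = a & b; // positions where both operands have a 1 produce a carry
            a = a ^ b;         // per-bit sum, ignoring carries
            b = carry << 1;    // each carry moves one position to the left
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(add(3, 3));
    }
}
```

Tracing 3 and 3 by hand: 011 and 011 give a carry of 011 and a partial sum of 000, the carry shifts to 110, and on the next pass there is no carry left, so the result is 110, which is 6.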

Now I hear you say, “Tests are not very user friendly, how is a business person supposed to understand this?”

Well, this is where my second part of this discussion comes in.

There is an open source project called Concordion that can help with this.

Concordion allows you to specify the functionality as examples into an HTML document. It uses the “<span>” tag to mark out sections of the document to be executed.

So, for our little example, our HTML specification would be:

<!DOCTYPE html>
<!--
To change this license header, choose License Headers in Project Properties.
To change this template file, choose Tools | Templates
and open the template in the editor.
-->
<html xmlns:concordion="http://www.concordion.org/2007/concordion">
    <head>
        <title>Test FizzBuzz Concordion</title>
    </head>
    <body>
        <h1>Test FizzBuzz with Concordion</h1>
        
        <p>If the Fizz is <span concordion:set="#fizz">1</span> and the Buzz is <span concordion:set="#buzz">1</span> then the FizzBuzz should be <span concordion:assertEquals="fizzBuzz(#fizz, #buzz)">2</span></p>
        
        <p>If the Fizz is <span concordion:set="#fizz">2</span> and the Buzz is <span concordion:set="#buzz">2</span> then the FizzBuzz should be <span concordion:assertEquals="fizzBuzz(#fizz, #buzz)">4</span></p>
        
        <p>If the Fizz is <span concordion:set="#fizz">3</span> and the Buzz is <span concordion:set="#buzz">3</span> then the FizzBuzz should be <span concordion:assertEquals="fizzBuzz(#fizz, #buzz)">6</span></p>

        <p>If the Fizz is <span concordion:set="#fizz">4</span> and the Buzz is <span concordion:set="#buzz">4</span> then the FizzBuzz should be <span concordion:assertEquals="fizzBuzz(#fizz, #buzz)">8</span></p>

    </body>
</html>

Which renders to:

Test FizzBuzz with Concordion

If the Fizz is 1 and the Buzz is 1 then the FizzBuzz should be 2

If the Fizz is 2 and the Buzz is 2 then the FizzBuzz should be 4

If the Fizz is 3 and the Buzz is 3 then the FizzBuzz should be 6

If the Fizz is 4 and the Buzz is 4 then the FizzBuzz should be 8

All very readable.

The next step is to add a fixture.  A fixture is a bit of code that links the HTML page to the system you are trying to test.

For this simple execution, the fixture is:

import static com.beanietech.fizzbuzz.FizzBuzz.fizzbuzz;
import org.concordion.integration.junit4.ConcordionRunner;
import org.junit.runner.RunWith;

@RunWith(ConcordionRunner.class)
public class FizzBuzzFixture {
   
  public int fizzBuzz(int fizz, int buzz){
      return fizzbuzz(fizz, buzz);
  }

}

Now, when we execute the fixture, we get:

[Image: fizzbuzzconcordion]

As you can see, the results are green. If, for example, we got something wrong, we would get:

[Image: fizzbuzzconcordionerror]

So, if we tie this into a continuous integration suite such as Jenkins or Bamboo, we should always have up-to-date documentation that has been verified and is human readable.

Now, if we need to add more tests, we no longer need to write code for this particular scenario. Why? Because we have decoupled “What we are testing” from “How we are testing”.

The fixture controls how we test the method, but the HTML page controls what we are testing. So, to add more tests, provided we haven’t changed any functionality, we just need to modify the HTML page.
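The same split between what and how can be sketched in plain Java, without Concordion. This is only an illustration: the class `ExampleTable`, the `EXAMPLES` array and `countFailures` are names I made up, and the local `fizzbuzz` method is a stand-in for FizzBuzz.fizzbuzz so the sketch compiles on its own.

```java
final class ExampleTable {

    // Stand-in for FizzBuzz.fizzbuzz so this sketch is self-contained.
    static int fizzbuzz(int fizz, int buzz) {
        while (buzz != 0) {
            int carry = fizz & buzz;
            fizz = fizz ^ buzz;
            buzz = carry << 1;
        }
        return fizz;
    }

    // "What we are testing": each row is {fizz, buzz, expected}.
    static final int[][] EXAMPLES = {
        {1, 1, 2},
        {2, 2, 4},
        {3, 3, 6},
        {4, 4, 8},
    };

    // "How we are testing": one check applied to every row.
    static int countFailures(int[][] examples) {
        int failures = 0;
        for (int[] row : examples) {
            if (fizzbuzz(row[0], row[1]) != row[2]) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + countFailures(EXAMPLES));
    }
}
```

Adding a scenario means adding a row; the checking code stays untouched. In the Concordion version the rows live in the HTML page instead of in an array.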

Finally, in this example I used Concordion, but there are other tools out there, such as FitNesse, which is wiki based, Cucumber and JBehave, which are text file based, and numerous others.

 


Lockpicking

December 24, 2013

Now for a post on something different.

I have played around with picking simple locks. Mainly picking the locks of filing cabinets at the office – not for personal gain, but usually because I (or someone in the team) have forgotten a key, or as a prank.

This is probably a skill most office workers should know about. Firstly, it shows how easily those filing cabinets can be opened. Secondly, it helps if the key is lost. Rather than drill out the lock, you can just pick it.

How most locks work

Most locks work on a tumbler system.

The following diagram shows a cross section of a lock. The tumblers (Blue and Green) are held in place by springs that push them up to their default positions. The tumblers are in 2 sections. Where they break is where they line up to open the lock.

The top part of the lock in this diagram is the barrel. This is what turns when you turn the key.

[Image: Lock - No Key]

When the breaks align, as they do when the key is inserted, you can turn the barrel and the lock opens. Fairly simple.

[Image: Lock - Key]

Get Your Tools Ready

To pick the lock, you basically have to get those tumblers to line up so you can turn the barrel.

To do this, you need 3 tools.

The first is a lever. This is what puts pressure on the lock to turn the barrel. For this, I use a jeweller’s screwdriver – largest size.

The second is the pick. This is what is used to push the tumblers down. For this – in the office, I use a large paperclip. The normal sized paper clips are too easily bent. Another thing you can use is a hairpin.

The third is a pair of needle nose pliers. This is to bend the pick to shape.

To bend the pick, straighten the paperclip out enough to get it into the lock, then add a little hook bend at the end. Only a small bend is necessary.

For example:

[Image: Pick bend]

That’s it. That is all the tools I use.

Pick the Lock

Now to pick the lock.

Take your screwdriver and place it in the lock as close to the middle as possible, but leaving enough room to get the pick into the section where the key’s teeth usually go.

Turn the screwdriver a little in the direction that the key would normally turn. Here you are putting shear pressure between the barrel and the tumblers. Don’t put too much as you may either bend your screwdriver, or break the lock. You only want a little more force than you would use to turn the key.

Insert your pick and slowly push down the tumblers. This can be done by just raking the pick across the tumblers. Pushing them down will cause them to get to the point where there is the break. The imperfections in the lock will allow you to hold the tumbler in place while you do the next one rather than have it spring back into its default position.

Eventually, the breaks in the tumblers will all line up and the lock will turn.

This method only works with cheap locks, as the imperfections in the lock allow the tumblers to be held in place with a little pressure. Higher quality locks have better machining, so the tumblers spring back into place more easily.


Version your builds in Netbeans

November 28, 2013

I’ve been playing around with NetBeans, building tools to help with the management of our WebMethods implementation, and I needed to version my library.

After a bit of searching, this is what I found. This will add the date to the manifest file in the jar build which you can then read from your java code.

Note: This is a quick and dirty way to get version information. It does not retrieve versions from git or subversion or any other VCS.

First things first, go into your NetBeans project folder and modify the build.xml file with the following entries.

<target name="-pre-init">
    <property name="project.name" value="My Library" />
    <property name="version.num" value="1.4.1" />
    <!-- Ant property names are case sensitive: "now" is the date only, "NOW" the full timestamp. -->
    <!-- The NOW pattern is an assumption, inferred from the Bundle-Date in the sample manifest. -->
    <tstamp>
        <format property="now" pattern="yyyy.MM.dd" />
        <format property="NOW" pattern="yyyy-MM-dd HH:mm:ss z" />
    </tstamp>

    <manifest file="MANIFEST.MF">
        <attribute name="Bundle-Name" value="${project.name}" />
        <attribute name="Bundle-Version" value="${version.num}" />
        <attribute name="Bundle-Date" value="${NOW}" />
        <!--<attribute name="Bundle-Revision" value="${svna.version}" />-->
        <attribute name="Implementation-Title" value="${project.name}" />
        <attribute name="Implementation-Version" value="${now}" />
    </manifest>
</target>

This will add the entries into your manifest file. The <tstamp> tags set the date format which we are using for our version number.

The following is a sample manifest file (Some entries removed)

Manifest-Version: 1.0
Ant-Version: Apache Ant 1.8.3
Created-By: 1.6.0_27-b07 (Sun Microsystems Inc.)
Class-Path: <ClassPath>
Bundle-Name: My Library
Bundle-Version: 1.4.1
Bundle-Date: 2013-11-28 11:08:23 EST
Implementation-Title: My Library
Implementation-Version: 2013.11.28
Main-Class: au.com.ebs.wm.IntegrationServer.IntegrationServerList
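Ant’s tstamp format patterns follow java.text.SimpleDateFormat, so you can try a pattern out in plain Java before putting it in build.xml. The sketch below is only an illustration: the class `TstampPatterns`, its methods and the fixed sample date are my own, and the second pattern is a guess at what produced the Bundle-Date above, minus the time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

final class TstampPatterns {

    // Formats a date with a SimpleDateFormat pattern, as a tstamp format entry would.
    static String format(String pattern, Date when) {
        return new SimpleDateFormat(pattern, Locale.ROOT).format(when);
    }

    // A fixed date (28 November 2013, 11:08:23 local time) so the output is repeatable.
    static Date sampleDate() {
        return new GregorianCalendar(2013, Calendar.NOVEMBER, 28, 11, 8, 23).getTime();
    }

    public static void main(String[] args) {
        System.out.println(format("yyyy.MM.dd", sampleDate()));
        System.out.println(format("yyyy-MM-dd HH:mm:ss", sampleDate()));
    }
}
```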

To access the version in your code, create the following class:

public class Version {
    // Returns the Implementation-Version entry from the jar manifest,
    // or null when the class was not loaded from a jar that has one.
    public String getVersion() {
        return getClass().getPackage().getImplementationVersion();
    }
}
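getImplementationVersion() only exposes the Implementation-Version entry. If you also want the other entries, such as Bundle-Version, one option is to parse the manifest with java.util.jar.Manifest. The following is a minimal sketch, not part of the original tool: `ManifestReader`, `mainAttribute` and the `SAMPLE` text are hypothetical, and it parses an in-memory copy of the manifest so it runs without a jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

final class ManifestReader {

    // Cut-down copy of the sample manifest, held in memory for the demonstration.
    static final String SAMPLE =
            "Manifest-Version: 1.0\n"
            + "Bundle-Name: My Library\n"
            + "Bundle-Version: 1.4.1\n"
            + "Implementation-Title: My Library\n"
            + "Implementation-Version: 2013.11.28\n";

    // Returns the named main-section attribute, or null if it is not present.
    static String mainAttribute(String manifestText, String name) {
        byte[] bytes = manifestText.getBytes(StandardCharsets.UTF_8);
        try {
            Manifest manifest = new Manifest(new ByteArrayInputStream(bytes));
            return manifest.getMainAttributes().getValue(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mainAttribute(SAMPLE, "Bundle-Version"));
    }
}
```

In a real jar you would feed the Manifest constructor the stream for META-INF/MANIFEST.MF rather than a string.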

Performance Issues

May 5, 2012

The Problem
I came across an interesting problem this week with an old eGate SRE system. For a couple of years now, every now and then, the eGate system would perform very badly under heavy load. The system has been running for more than 10 years, but a major release was added 2 years ago. Each time the performance issue happened, we would check the system and try to find out where the bottlenecks were. Each time, we couldn’t find anything. We would get the UNIX guys involved, and they would check CPU, disk, memory and network, the classic places a stressed-out application shows symptoms, and find that the system was hardly using any resources. Typically with performance issues in eGate you have very few tuning options, and basically your only move is to replicate components to try to increase parallel throughput. It turns out that doing this only compounded our issues.

This week, the age old problem occurred again, but this time the person looking on the Unix side found something else.
Our system was locking a particular .ssc file in the monk_scripts/common/.lockdir directory. For those that don’t know what an .ssc file is, it’s the event type definition file of a message structure. In other words, it defines the structure of a message in the Monk programming language in eGate SRE.
This particular .ssc file was being used by the auditing system. In the design of this eGate implementation, every component sends an audit event to a queue which then allows us to track the flow of each message through the eGate implementation. Very handy in trying to determine where a message failed in the process flow.

So, what was happening? Well, whenever a component tried to audit a message flowing through it, it would lock this particular .ssc file as it constructed the audit message. In doing so, it prevented any other component from auditing while the lock was present. Thus every other component would wait until the lock was released. Now the lock isn’t held for long, maybe 10 ms, but given that this particular ETD was being used by over 90 components, some of them multi-threaded, you see the issue. You get contention, and most of your components are waiting for this particular file rather than processing messages.
This is classic lock contention rather than a shortage of resources, and our previous attempts to increase throughput by increasing the number of components were actually slowing the system down more.
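The effect of many components competing for one lock can be sketched in Java. This is a toy model under my own assumptions, not how eGate is implemented: the class `AuditContention` and its fields are invented, a ReentrantLock stands in for the file lock, and tryLock() plays the role of a non-blocking lock attempt that is refused when someone else holds the lock.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

final class AuditContention {

    final ReentrantLock sscLock = new ReentrantLock(); // stands in for the lock on the .ssc file
    final AtomicLong refusedAttempts = new AtomicLong(); // lock attempts that had to be retried
    long built; // audit messages built; only touched while holding the lock

    // One audit call: retry until the lock is free, then build the audit message.
    void audit() {
        while (!sscLock.tryLock()) {
            refusedAttempts.incrementAndGet();
            Thread.yield();
        }
        try {
            built++;
        } finally {
            sscLock.unlock();
        }
    }

    // Runs the given number of components, each auditing messagesEach messages.
    static AuditContention run(int components, int messagesEach) {
        AuditContention sim = new AuditContention();
        Thread[] workers = new Thread[components];
        for (int i = 0; i < components; i++) {
            workers[i] = new Thread(() -> {
                for (int m = 0; m < messagesEach; m++) {
                    sim.audit();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sim;
    }

    public static void main(String[] args) {
        AuditContention result = run(8, 100_000);
        System.out.println("built=" + result.built + " refused=" + result.refusedAttempts.get());
    }
}
```

With a single component the refused count stays at zero. With more components the same total work is still done one audit at a time, and the refused count typically climbs, which is the wasted effort described above.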

I can now see you wondering why we were locking files in the first place. Here is the kicker, our code wasn’t doing the locking. It was the underlying code of the eGate framework. So we had no way of knowing this was happening. Not without help from the Unix team.

Finding the problem
So, how did the Unix guy find the file being locked?
He used the command truss -p pid
He then looked for a line that contained the word EAGAIN
This would bring up a line similar to fcntl(53, F_SETLK, 0xFEA72AC8) Err#11 EAGAIN
where 53 is the file descriptor of the file being used.

To find the file, he used pfiles pid. This brings up a list of the files the process has open, and he checked which file had file descriptor 53, shown in the left column.

The Solution
To try to resolve the issue, the first thing we did was find out where the locked file was being used. It turned out to be used in a Monk function that was called by every component.
Checking the Monk function, the .ssc file was being loaded every time the function was called. I should note at this point that this eGate system has been running in production for over 10 years and this code was written in 2002.
So the first fix was to only load the .ssc into memory if it hadn’t been loaded previously.
Our thinking was that if the file isn’t being loaded, it shouldn’t be locked.
This turned out not to be the case. Performance remained just as bad.
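The idea behind that first fix, load once and reuse, looks roughly like this in Java. The real change was made in Monk, which I’m not reproducing here; `DefinitionCache`, its loader function and the `loads` counter are hypothetical names used only to illustrate the pattern.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class DefinitionCache {

    private final Map<String, String> loaded = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // stands in for reading the .ssc file
    final AtomicInteger loads = new AtomicInteger(); // how many times the loader actually ran

    DefinitionCache(Function<String, String> loader) {
        this.loader = loader;
    }

    // Returns the definition, invoking the loader only on the first request for a name.
    String get(String name) {
        return loaded.computeIfAbsent(name, key -> {
            loads.incrementAndGet();
            return loader.apply(key);
        });
    }

    public static void main(String[] args) {
        DefinitionCache cache = new DefinitionCache(name -> "structure of " + name);
        cache.get("audit.ssc");
        cache.get("audit.ssc");
        System.out.println("loads: " + cache.loads.get());
    }
}
```

As the post says, caching the load did not remove the locking in our case, which suggests the framework was taking the lock somewhere other than the explicit load.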

The second attempt was to segregate groups of components so that each used its own version of the .ssc file. One set of replicated components would use Version A, another group would use Version B, and so forth.
This solution also didn’t work, because the contention was happening at the first set of components that used the Monk language in volume. Segregating at that point only shifted the problem to the new .ssc.
Components further downstream didn’t have the issue to the same degree: processing was slowed so much at this first point that locking had no significant impact on them. But what we did find, while making the changes in a live (argh!) system, was that at some point performance increased significantly, then dropped after we completed our changes. Why? Because at one point only 1 or 2 components were sharing a particular .ssc. Performance increased from about 500-600 messages every half hour to several thousand during the change, but quickly dropped back to about 1000-1200 per half hour afterwards. Still not good enough for the backlog that was building up.

The next attempt at a solution was to reduce the number of running components that used the .ssc. At the first point we reduced the number of components from 8 to 2. Again performance increased slightly, to about 1500-1700 messages per half hour. Unfortunately, overnight our inbound load peaked at around 2000-3000 messages per half hour and then dropped to a steady 1500 or so. Our queue depth was dropping, but ever so slowly.

The next solution I implemented at about 4am. After checking the overnight stats on the backlog’s progress, or should I say lack of it, and with a normal business day’s worth of transactions fast approaching, I made the decision to remove the call to the audit system on successful processing of messages. Failed processing of messages was still audited.

The Results
I implemented the change on a single component. The effect was immediate. About 500 messages had been processed by that one component in 10 minutes. So I immediately did the same to a second component to check the results. That component also processed a significant number of messages within a short time frame.
At that point in time, I decided to leave the system till I got into work at 7am.
Checking the logs when I got into work, the two components I had changed processed about 30,000 messages between them in a 45 minute period.
We still had a backlog, but it had progressed to the next component in the flow. So we made the same change at that component, and so forth and the backlog cleared very quickly.

We have now isolated the issue to the audit function and a workaround has been implemented which is to not use the function that caused the issue in the first place.
Monday’s work is going to be to try to re-write the audit function to not use the .ssc file and try to re-integrate it back into the system and fingers crossed not cause another performance impact.

It just goes to show that performance issues can occur in the most unlikely places. Without the help of a Unix team member who went above and beyond the regular suspects of disk, memory, CPU and network, and actually checked what was happening at the process level, we would never have had the lead that found the root cause.
There is nothing about this in the product documentation, and having someone from the vendor come in and have a look, which management very much like to do at every issue, would not have helped either. I know, because I worked for this particular vendor as a consultant and implementer of this particular product for over 10 years.


Connecting to Java MQ through JNDI Programmatically (Repost)

January 4, 2012

This is a repost from http://blogs.sun.com/jcapsuser/entry/connecting_to_java_mq_through

The following document goes through how to set up JNDI for a queue in Java MQ and then gives you source code to read and write to the queue. This source can be used as a basis to send messages to Java CAPS for automated testing within your Java CAPS implementation.

I have only tried the program on my localhost, so I’m not sure how well these instructions will hold up for remote hosts. When I get a chance to try it out and get it working, I’ll update this document.


Build Your Own?

November 23, 2011

On his blog at soa.sys-con.com, Hollis Tibbetts says:

Build Your Own Integration Stack?
— “DIY”, “homegrown” or “hand-coded” is the most commonly used method of data and application integration. It’s also almost always a terrible idea.
I remember back in the mid/late 1980’s talking to IT departments about the concept of a relational database, and why having one on their VAX would be a good idea.
In many cases, my recommendation was met with some head scratching, brow furrowing and comments along the lines of “why would we buy one of those, when we can just build our own data storage system using RMS records?”
100 years later, it’s a pretty rare occurrence that someone would decide to build their own data storage and manipulation system – if they do, it’s because of some unusual requirement. As far as I know, nobody considers “build” as the default strategy in this area.

I came across a customer a few years ago where, for one project, they had written their own integration engine despite having bought one. A colleague and I were there to fix an issue with the engine they had bought, and when we talked about the project that wrote its own, we came to the same conclusion: why would you build one from scratch? If you don’t want the licensing cost, there is always the option of open source. There is Apache ServiceMix, Apache Camel if you need something lightweight, Mule, and even OpenESB, which is still alive and kicking, and I’m sure there are plenty of others. A few open source ESBs even use other open source ESBs as a base.
The only reasons I can see to build your own from scratch in the enterprise are:

  • For the experience. You might want to know how an engine works or see how to implement one in another language such as Scala, Python or Clojure.
  • There is a feature you require that isn’t present in any other engine.

What do you think?

