2014/03/08

back to the beginning ... construct

I had a couple of constructive discussions after my last post. Some doubts were raised about whether what I described is actually real.

There's only one way to answer those doubts and that is by showing you. So when I was revising some stuff for a customer earlier this week, I reconstructed one component for discussion here. Before I go into the details (and show you where you can find it), allow me to say a couple of words about APIs on the Internet.


Well, like Apis mellifera (the honey bee), APIs on the Internet have quite a sting, but unlike bees they regularly sting again (wasp-like), as anybody trying to keep up with the Google, Facebook, Twitter, Dropbox, <you name it> APIs can attest.

However, for all their perceived flaws (which I won't go into) they are a step towards a programmable web and I deal with them on a frequent basis. What I typically do in order to use them is construct a component that :
  1. Shields the complexity.
  2. Constrains the usage.
  3. Improves the caching.
As an example, I created the urn.com.elbeesee.api.rovi module which is now available on Github. The Rovi (http://developer.rovicorp.com) API provides metadata for movies and music products.

Note that I only provided the source; if you want to use it you'll have to build the Java code. If that is making too much of an assumption on my part, contact me and I'll walk you through it, no problem. If I get lots of requests, I'll blog about it next time.

You'll notice that the module provides two accessors. One, active:rovimd5, is private and computes the signature necessary to make requests. The other, active:rovirelease, is public; it takes an upcid as argument and provides access to the Rovi Release API.

In order to use active:rovirelease, it needs to be able to resolve three resources when you issue a request to it : rovi:apikey, rovi:secretkey and rovi:expiry.
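
How you provide them is up to the consuming space. A quick way during development is with literal resources, like this sketch (the values are obviously made up, and I keep the expiry - in milliseconds - as a string) :

    <literal type="string" uri="rovi:apikey">your-api-key-here</literal>
    <literal type="string" uri="rovi:secretkey">your-secret-key-here</literal>
    <!-- how long a response stays relevant, in milliseconds (one hour here) -->
    <literal type="string" uri="rovi:expiry">3600000</literal>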

The first two are obvious, as is the reason why I'm not providing them in the module itself. The third one may be less obvious, but you'll note the following in the code :

rovireleaserequest.setHeader("exclude-dependencies",true); // we'll determine cacheability ourselves
 

When making the actual request to Rovi I ignore any caching directives that come my way. And on the response I do the following :

vResponse.setExpiry(INKFResponse.EXPIRY_MIN_CONSTANT_DEPENDENT, System.currentTimeMillis() + vRoviExpiry);
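
Put together, the heart of active:rovirelease looks something like the sketch below. This is not the verbatim module source - the construction of the signed request is elided and the variable names are mine - but it shows the pattern : ignore the API's caching directives on the way out, apply your own expiry on the way back.

String vApiKey = aContext.source("rovi:apikey", String.class);
String vSecretKey = aContext.source("rovi:secretkey", String.class);
long vRoviExpiry = Long.parseLong(aContext.source("rovi:expiry", String.class));

// ... build the signed Rovi request here, using active:rovimd5 for the signature ...
rovireleaserequest.setHeader("exclude-dependencies", true); // we'll determine cacheability ourselves
String vRelease = (String)aContext.issueRequest(rovireleaserequest);

// our business logic, not Rovi, decides how long this response stays fresh
INKFResponse vResponse = aContext.createResponseFrom(vRelease);
vResponse.setExpiry(INKFResponse.EXPIRY_MIN_CONSTANT_DEPENDENT, System.currentTimeMillis() + vRoviExpiry);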

Two questions that can be raised about this are :
  1. Why are you doing this ?
  2. Is this legal ?
The Why is easy to answer. It is my business logic that should decide how quickly it wants updates, not a 3rd party API that wants to make money out of my requests. 

The legal aspect is not so clear and you should carefully read what the terms of usage are. Note however that I am not persisting any results from the API; I'm just letting the business logic dictate how long they are relevant in memory (and since memory is limited, the distribution of the requests will determine which results remain in memory and which do not).

Adding persistence would not be very hard; however, especially for paying services, you then need to be fully aware of the terms of usage. Contact me for details if you want to know how to add a persistence layer.

Another takeaway from this module is that I throttle active:rovirelease. Granted, this is maybe also something that shouldn't be done in there (it may depend on your business model), but controlling the flow is an important aspect of using APIs and this is a - simple - way to do it.
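
To make the concept concrete, here is flow control reduced to plain Java. This is a hypothetical sketch of the idea only, not how the module actually implements it (and it assumes the standard accessor base class) : a fixed number of requests are allowed through concurrently, the rest queue up.

import java.util.concurrent.Semaphore;
import org.netkernel.layer0.nkf.*;
import org.netkernel.module.standard.endpoint.StandardAccessorImpl;

public class ThrottledAccessor extends StandardAccessorImpl {
    // hypothetical : allow at most 5 concurrent calls to the upstream API
    private static final Semaphore THROTTLE = new Semaphore(5);

    public void onSource(INKFRequestContext aContext) throws Exception {
        THROTTLE.acquire(); // blocks while 5 requests are already in flight
        try {
            // ... issue the request to the upstream API here ...
        } finally {
            THROTTLE.release();
        }
    }
}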

A last takeaway is that I don't interpret the result of the request in this component, nor do I force it into a format of any kind. I will grant that adding some form of handling after the actual API call could be useful, but it doesn't belong here. This component shields the API. What is done with the result belongs in another component.

This component is used in production. It is reality. You'll also find that it doesn't contain any kind of magic or clever coding (or a lot of coding at all). And yet it accomplishes quite a few things. The main thing it accomplishes is that it turns an API not under your control into a resource that is.

2014/01/27

back to the beginning ... context

I gather that Thor (aka Peter Rodgers) approves of my Back To The Beginning series. Let me tell you, it is hard going: not a post goes by without me making a couple more assumptions than I wanted and creating a couple more loose ends that I need to tie up. It is mostly context; ROC/NetKernel is what I have been breathing for the past six years. When I look at other evolutions in the IT field (yes, I do) they look as alien to me as ROC/NetKernel might look to you. It's all a matter of point-of-view, of context.

This brings me to a question I got about my last post. Actually, two questions.

The first one was : "Why are you not showing a lot more code ? Things remain at the child's play stage without it, a real application has a lot more code !". While I disagree with the statement, that is a good observation that deserves an answer.

I'm not easily offended and I had a very good email discussion with the person that made the observation. Feel free to contact me if you have questions too !


ROC development is not code-centric. It is about constructing small components (endpoints) that you can compose together and constrain as needed. If you know about shell scripting, you are already familiar with this concept. A lot of components (for example, I used an XSLT transformation in an earlier post) are already available.

The components should be made such that they do one task, one task only. For myself I use the following rule-of-thumb ... if a component requires more than 200 lines of source code, comments and logging included, I have not been thinking things through enough.

Tony Butterfield gave me the following thought exercise to come to grips with it. Would the Linux ls command be a good candidate for a ROC component ? The answer is ... no. The core of the command (directory listing) is fine but it has too many formatting and filtering options. By putting those options in the same component you would take away the chance for the ROC system to optimize them. They should have been in different components.

So the fact that there hasn't been a lot of code in this series reflects the reality of developing in ROC/NetKernel; it is not me trying to avoid complexity.

The second question was : "What is this aContext-thingie in your Java code ?". Ah, right, oops, I didn't actually discuss coding in the ROC world at all yet, did I ?

Well, to start, if it runs in the JVM, you can potentially write ROC code in it. In practice, I find that a combination of Java and Groovy is ideal. Note that I wasn't formally trained in either, and that I am pretty proficient in Python (and Jython is an option in NetKernel). However, if 200 lines are all I'm going to write for a given component, I'm not going to require wizardry in any language, right ? So I decided to use Java for writing new components (since NetKernel was developed in it, it is closest to the metal) and Groovy as my composition language.

I am quite serious : I can't stand up to any seasoned J2EE developer in a "pure" Java coding challenge and I have great respect for their skills. However, if I'm allowed to use ROC/NetKernel to solve the problem, I will go toe to toe with the best.

Writing a new component always follows these four steps :
  1. What have I promised that I will provide ?
  2. What do I need in order to deliver my promise ?
  3. Add value (the thing that makes this special).
  4. Return a response.
When a request is made of an endpoint, you are handed the context of that request in order to take those steps. This context allows you to pull in the arguments of the request, for example :

String theargument = context.source("arg:theargument",String.class);


It allows you to create new (sub)requests in order to add value, for example :

INKFRequest subrequest = context.createRequest("active:somecomponent");

subrequest.addArgumentByValue("theargument", theargument);
String theresult = (String)context.issueRequest(subrequest);

And finally it allows you to return a response :

context.createResponseFrom(theresult);
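
Pulling the four steps together into one complete Java accessor gives something like the sketch below. I'm assuming - as the HelloWorldAccessor from the introduction module does - that it extends NetKernel's standard accessor base class, and active:somecomponent is of course made up :

import org.netkernel.layer0.nkf.*;
import org.netkernel.module.standard.endpoint.StandardAccessorImpl;

public class MyComponentAccessor extends StandardAccessorImpl {
    public void onSource(INKFRequestContext context) throws Exception {
        // 1. what have I promised ? pull in the argument
        String theargument = context.source("arg:theargument", String.class);

        // 2. and 3. what do I need / add value : delegate to a subrequest
        INKFRequest subrequest = context.createRequest("active:somecomponent");
        subrequest.addArgumentByValue("theargument", theargument);
        String theresult = (String)context.issueRequest(subrequest);

        // 4. return a response
        context.createResponseFrom(theresult);
    }
}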

When using Java you are given slightly more control over things; the HelloWorldAccessor with the onSource method from the introduction module is a good starting point. We'll discuss the different types of endpoints and verbs in a later post (loose ends again, I know). The same thing in Groovy would look like this :

import org.netkernel.layer0.nkf.*;
import org.netkernel.layer0.representation.*
import org.netkernel.layer0.representation.impl.*;


// context is handed to us by the runtime
INKFRequestContext aContext = (INKFRequestContext)context;

aContext.createResponseFrom("Hello World");

Due to the way a Groovy program is instantiated you are dropped straight into the equivalent of the onSource method in a Java accessor. Also, the assignment of context to aContext is strictly speaking not necessary; it is a coding practice that gives my editor (Eclipse) the correct type information. In any of the available scripting languages you'll always have context available.

So ... why are these two (the Java accessor in the introduction module and the Groovy program above) good starting points but actually bad components ? Because they don't add value : the response is a static string, so I could just as well - and did in the introduction module - define a literal resource.

Food for thought, no ?

2014/01/03

back to the beginning ... logging

In the second half of the 1990s I was an IDMS database administrator for Belgium's biggest retailer. When our resident guru almost got killed by the job I got most of the main databases in my care ... and I must admit I ruled supreme. If you've never heard of the BOFH, check him out here and here. I don't know if any of those stories are based on reality, but they are nothing compared to some of the stuff I pulled off.

I hated logging.

Let me place that statement in the correct context. PL/I did not allow asynchronous actions, so logging ate processing time. Also, disk storage was not cheap; the estimated cost of storage could and would often kill a project before it even started. Database storage was even more expensive. Migration to tape was an option, but it made consultation afterwards very impractical.

This brings me to the why of logging. Opinions may differ but I see only two reasons :
  • Audit
  • Bug fixing by means of postmortem analysis
Audit. You want to know what was requested and - if you're a security auditor - who requested it. This is a - and in my view the only - legitimate reason for logging. However, even back then tools existed that allowed auditing without having to put it in the code. A fellow database administrator who was a bit too curious about the wages of the others and looked them up in the database found that out the hard way.

Bug fixing by means of postmortem analysis. You want to know what the state of the system was at the time of an error. This requires excessive amounts of logging. It did back then and it does today. And let me tell you something ... it's never enough.

You might say I'm an old fart who's not up to speed with the current state of technology. Storage is very cheap, asynchronous logging has become the standard ... and doesn't everybody say that you should log everything because it can be a big stash of useful data in itself ?

As a matter of fact, they - whoever they are - don't. They mean audit data collected at the edges of your system, not the things you'd typically put in a PL/I PUT SKIP LIST or a System.out.println.


I still hate logging. And therefore I was very happy that NetKernel 4, when it was released, contained a back-in-time machine. So powerful was it that 1060 Research also released a backport for NetKernel 3. This time machine is also known as the Visualizer. When running, it captures the complete (!) state of any request in the NetKernel system. Anybody who has worked with it agrees that it is sheer magic, for you can pinpoint any error almost immediately. Such a powerful tool warrants its own blogpost, so that's for next time.

All personal opinion aside, how does one log in NetKernel then ? Well, let's see how we can add some logging to our introduction module. First I want an audit trail of the incoming requests. We could write our own transparent overlay for this - another topic for a future blogpost - but as it happens the HTTP Jetty Fulcrums have everything that's needed.


Open [installationdirectory]/modules/urn.org.netkernel.fulcrum.frontend-1.7.12/etc/HTTPServerConfig.xml in your favorite text/xml editor. Remove the <!--Uncomment for NCSA Logging line and the matching --> line, so that the logging configuration between them becomes active. You can also change the settings and/or the name of the logfile. Restart NetKernel (this change is not picked up dynamically). You should now find a new logfile under [installationdirectory]/log.

Now try http://localhost:8080/introduction/helloworld-file in your browser. Open up the logfile and you should see something like this (given date, locale and browser differences) :

0:0:0:0:0:0:0:1 -  -  [03/jan/2014:09:42:35 +0000] "GET /introduction/helloworld-file HTTP/1.1" 200 0 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0"

That's audit taken care of. Do I have to mention the fact that this logfile can easily be picked up by tools like Splunk and Logstash ?

If you need help with integrations like that, feel free to contact me. I've been there and done that.

Maybe you think I'm full of shit in my rant above or maybe you just have to comply with the rules and regulations of a customer. If so, yes, you can also do explicit logging in NetKernel.

Since the HelloWorldAccessor is the only piece of code we have, it's in there that we'll add it. The onSource method looks like this :

    public void onSource(INKFRequestContext aContext) throws Exception {
        aContext.createResponseFrom("Hello World");
    }


Adding logging is as simple as :

    public void onSource(INKFRequestContext aContext) throws Exception {
        aContext.logRaw(INKFLocale.LEVEL_INFO,"Logging from HelloWorldAccessor");
        aContext.createResponseFrom("Hello World");
    }


You'll notice the logging appears in two places. Firstly in the standard output of the NetKernel process :

I 11:00:34 HelloWorldAc~ Logging from HelloWorldAccessor

Secondly in [installationdirectory]/log/netkernel-0.log :

<record>
  <date>2014-01-03T11:00:34</date>
  <millis>1388743234821</millis>
  <sequence>278</sequence>
  <logger>NetKernel</logger>
  <level>INFO</level>
  <class>HelloWorldAccessor</class>
  <thread>167</thread>
  <message>Logging from HelloWorldAccessor</message>
</record>


Why is this ? Well, the log methods look for a configuration resource. Either you pass this resource in the method, or the resource res:/etc/system/LogConfig.xml is used (if it can be found), or - as a last resort - [installationdirectory]/etc/KernelLogConfig.xml is used. Check that one out : it has two handlers, which is why the message shows up in both places.

So, to preempt a couple of questions : if you want a different log for each application, you can have one. If you want a JSON-formatted log, no problem. Another common request these days (for yes, I am up to speed) is that the log messages themselves have to be formatted.

In order to do that, your module requires a res:/etc/messages.properties file resource. An actual file, yes : logging is provided at such a low level that not all the resource abstractions are in place yet. The file can contain things like :

AUDIT_BEGIN={"timestamp":"%1","component":"%2", "verb":"%3", "type": "AUDIT_BEGIN"}
AUDIT_END={"timestamp":"%1","component":"%2", "verb":"%3", "type": "AUDIT_END"}


In your code you can then write :

aContext.logFormatted(INKFLocale.LEVEL_INFO, "AUDIT_BEGIN", System.currentTimeMillis(), "HelloWorldAccessor", "SOURCE");

And the results look like this :

I 11:42:03 HelloWorldAc~{"timestamp":"1388745723359","component":"HelloWorldAccessor", "verb":"SOURCE", "type": "AUDIT_BEGIN"}

and :

<record>
  <date>2014-01-03T11:42:03</date>
  <millis>1388745723360</millis>
  <sequence>263</sequence>
  <logger>NetKernel</logger>
  <level>INFO</level>
  <class>HelloWorldAccessor</class>
  <thread>176</thread>
  <message>{"timestamp":"1388745723359","component":"HelloWorldAccessor", "verb":"SOURCE", "type": "AUDIT_BEGIN"}</message>
</record>
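
To get a matching begin/end pair per request, you'd bracket the actual work in the endpoint. A minimal sketch, using the message keys defined in messages.properties above :

    public void onSource(INKFRequestContext aContext) throws Exception {
        aContext.logFormatted(INKFLocale.LEVEL_INFO, "AUDIT_BEGIN",
            System.currentTimeMillis(), "HelloWorldAccessor", "SOURCE");
        try {
            aContext.createResponseFrom("Hello World");
        } finally {
            aContext.logFormatted(INKFLocale.LEVEL_INFO, "AUDIT_END",
                System.currentTimeMillis(), "HelloWorldAccessor", "SOURCE");
        }
    }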


Again I'll mention the fact that these logfiles can easily be picked up by tools like Splunk and Logstash. And there you have it, a complete - and customizable - logging system. To close my post I'm going to talk about the loglevels for a moment. NetKernel provides :
  • INKFLocale.LEVEL_DEBUG
  • INKFLocale.LEVEL_FINEST
  • INKFLocale.LEVEL_FINER
  • INKFLocale.LEVEL_INFO
  • INKFLocale.LEVEL_WARNING
  • INKFLocale.LEVEL_SEVERE
Not only can these be easily matched to ITIL-aware operations systems, you can also turn them on and off in NetKernel itself. That will allow you to save quite a bit of storage ... you never know when that might come in handy ;-).

2013/12/31

interesting times

If you're looking for the episode on logging, soon my friend, soon. 
If you've noticed my last post of 2012 was also titled interesting times, you are correct.

A year ago I promised Tony Butterfield that I would force a breakthrough for Resource Oriented Computing / NetKernel in 2013. With about eight hours left in 2013 here in Brussels, Belgium, I must admit defeat as far as that claim goes.

People that know me personally know that I do not take defeat - in any area - lightly or graciously. 

It has been a good year but not the exceptional year that I envisioned when I made that claim.

So, here are some of my ROC related targets for 2014 :
  • At least one ROC backed game playable in one of the mainstream social networks.
  • Positioning ROC as the best solution for Linked (Open) Data projects.
  • Positioning ROC as the green IT solution. With the point of no return for climate change set in 2015, computing must start contributing too.
  • A new book.
  • <any target you, my constant readers, care to set ?>
  • ...
Best wishes and see you in 2014 !
 

2013/12/11

back to the beginning ... unit test

In IT a lot of fuss is made these days around testing and logging. The former is used to avoid introducing bugs into software and the latter is used to be able to find the inevitable bugs that are introduced into software.

And no, the above is not said in a light or satirical tone. I wouldn't dare, the industry has gotten quite anal on these subjects, as - to give one example - Rich Hickey found out when he compared Test Driven Development (TDD) to learning to drive by bumping into the guardrails. Testing and logging are to be taken very seriously and they have become big moneymakers in their own right.

This does say a lot about the IT industry as a whole, but I'm not going there either ...

Today I'm going to discuss testing within the Resource Oriented Computing worldview, logging will be in the spotlight next time.

Why test ?
  • unit test : to check if a piece of code delivers on its promise 
  • integration / system test : to check if a piece of code works in combination/interaction with other pieces of code in a complex system
  • regression test : to check if a piece of code still works after a change has been introduced elsewhere in a complex system 
You probably have other definitions in your textbook; mine go back to my initial IT training in the early nineties. Yes, I am that old.

When you're working with ROC/NetKernel, you'll soon notice that there's less fuss about testing. Reasons :
  • less code
  • smaller components
  • less complexity
Another - maybe less obvious - reason is that constraints are typically applied at the end of the Resource Driven Development (RDD) cycle - more on Construct Compose Constrain in a later post - not at the beginning. RDD and TDD approach the same problems from opposite sides.

Time to discuss the testing framework in NetKernel. Yes, there is one and yes, it works with resources. Hey, what did you expect ? Last time I created a service for the Amazon Web Services tools. You can find the source for that module, urn.com.amazonaws.sdk, on Github.

You'll find three spaces in the module.xml. The public urn:com:amazonaws:sdk space that contains the service, the private urn:com:amazonaws:sdk:import space that contains the internally used resources and tools and the public urn:com:amazonaws:sdk:unittest space. Can you guess what that last one is for ?

There is an ongoing debate whether or not the unit tests should be included in a module or live in a separate module. As with the naming of spaces (I've gotten some comments on that) ... whatever suits you best. If you are adamant about not deploying tests to production, you might want to split them off. If you want to have tests for your private spaces - as I do - you must include them.

The testing framework works - like the fulcrums - with a dynamic import. It searches for public spaces with a res:/etc/system/Tests.xml resource and pulls those spaces inside.
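
For the module at hand that means the urn:com:amazonaws:sdk:unittest space has to expose a res:/etc/system/Tests.xml resource and the test-lists, and import the spaces under test. A sketch of how that can look - the actual module on Github may lay it out slightly differently :

    <rootspace
        name="amazonaws sdk unittest"
        public="true"
        uri="urn:com:amazonaws:sdk:unittest">

        <fileset>
            <!-- picked up by the testing framework's dynamic import -->
            <regex>res:/etc/system/Tests.xml</regex>
        </fileset>

        <fileset>
            <regex>res:/resources/unittest/.*</regex>
        </fileset>

        <import>
            <!-- the spaces under test -->
            <uri>urn:com:amazonaws:sdk</uri>
        </import>
        <import>
            <uri>urn:com:amazonaws:sdk:import</uri>
        </import>
    </rootspace>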

It is a bit confusing given the tag-names, but the resource describes groups (!) of tests. Some people prefer one group for the whole module; I usually have one group per space that I want to test :
            <tests>
                <test>
                    <id>test:amazonaws:sdk:import</id>
                    <name>amazonaws sdk import unittest</name>
                    <desc>amazonaws sdk import unittest</desc>
                    <uri>res:/resources/unittest/sdk-import.xml</uri>
                </test>
                <test>
                    <id>test:amazonaws:sdk</id>
                    <name>amazonaws sdk unittest</name>
                    <desc>amazonaws sdk unittest</desc>
                    <uri>res:/resources/unittest/sdk.xml</uri>
                </test>
            </tests>

 

Each group contains the uri of the resource with the actual list of tests for that group, in this case res:/resources/unittest/sdk-import.xml and res:/resources/unittest/sdk.xml. Let's look at the former :
            <testlist>
                <test name="SOURCE regions.xml resource">
                    <request>
                        <identifier>res:/resources/xml/regions.xml</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <notNull/>
                    </assert>
                </test>
            </testlist>

 

And the latter :
            <testlist>
                <test name="SOURCE s3 eu-west-1 resource">
                    <request>
                        <identifier>res:/amazonaws/baseurl/s3-eu-west-1</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <stringEquals>s3-eu-west-1.amazonaws.com</stringEquals>
                    </assert>
                </test>
            </testlist>


These test-lists can contain as many tests as you deem necessary. In this case I have one in each. I check if regions.xml is there and I check one case of the service.

If you've been properly indoctrinated you'll probably wince in agony at the above. Only two tests ? What about some more cases ? Bear with me; if you want to add more, you can.

Let us have a look at the XUnit Tests tool first.

I filtered the results so you only see the two relevant test-lists. If you want to run a lot of tests at once, select the test-lists you want to run and select execute. Or you can select a single test-list with view and execute that one on its own.

Note that this is not only a GUI; it is quite easy to add this to an automation and get the test-results as XML documents. Everything is a resource !

I'm not going to run through all the XUnit Testing documentation, but let's add a couple more tests so you all feel a bit more at ease. First in res:/resources/unittest/sdk-import.xml :
            <testlist>
                <test name="SOURCE regions.xml resource">
                    <request>
                        <identifier>res:/resources/xml/regions.xml</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <notNull/>
                    </assert>
                </test>
 
                <test name="SOURCE regions.dat resource">
                    <request>
                        <identifier>res:/resources/xml/regions.dat</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <exception/>
                    </assert>
                </test>

            </testlist>
 
If you then run the tests for that list, you'll see that both tests pass.

and next in res:/resources/unittest/sdk.xml :
            <testlist>
                <test name="SOURCE s3 eu-west-1 resource">
                    <request>
                        <identifier>res:/amazonaws/baseurl/s3-eu-west-1</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <stringEquals>s3-eu-west-1.amazonaws.com</stringEquals>
                    </assert>
                </test>
 
                <test name="SOURCE s4 eu-west-1 resource">
                    <request>
                        <identifier>res:/amazonaws/baseurl/s4-eu-west-1</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <exception/>
                    </assert>
                </test>

             </testlist>

If you then run the tests for that list ... oops, that wasn't quite what I intended : the new test fails. What went wrong here ? Select exec to see the actual result of the request. The grammar of the res:/amazonaws/baseurl/s4-eu-west-1 request is valid, so the regions.xml resource is passed through the xslt transformation and comes up with unknown as the result. A valid result. Not an exception. So let's change our test-list again :
            <testlist>
                <test name="SOURCE s3 eu-west-1 resource">
                    <request>
                        <identifier>res:/amazonaws/baseurl/s3-eu-west-1</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <stringEquals>s3-eu-west-1.amazonaws.com</stringEquals>
                    </assert>
                </test>
 
                <test name="SOURCE s4 eu-west-1 resource">
                    <request>
                        <identifier>res:/amazonaws/baseurl/s4-eu-west-1</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <stringEquals>unknown</stringEquals>
                    </assert>
                </test>

                <test name="SOURCE blabla resource">
                    <request>
                        <identifier>res:/amazonaws/baseurl/blabla</identifier>
                        <verb>SOURCE</verb>   
                    </request>
                    <assert>
                        <exception/>
                    </assert>
                </test>

            </testlist>

And the result : all three tests pass.

Much better. And here's my opinion : the extra test I added to res:/resources/unittest/sdk-import.xml is useless, as it doesn't test functionality; the two extra tests I added to res:/resources/unittest/sdk.xml are useful, as they do test functionality. It may not always be so clear and if you feel you've got to err on the side of caution, by all means do. Just remember that your purpose is solving problems, not writing tests.

Right, that's a lot of information again. Don't forget to look through the XUnit Test documentation; you'll find that you can adapt and extend it to your own preferences.

As I promised, next time I'll discuss logging and as a bonus I'll also explain what that Limiter endpoint (did you notice it in the unit test space ?) does. Keep on ROCing in the meantime !

2013/11/29

back to the beginning ... mapper pattern

It was to be expected : the moment I try to take things from the top, there is a major update to the NetKernel administrative interface (aka the Backend Fulcrum). If you haven't done the update yet, you should; it is astonishing. I personally have a blind spot in the user-experience area (being a command-line jockey) but even I can see this is a huge improvement.
No, I'm not going to update my prior posts with new screenshots. Things have been cleaned up and the integration between the tools is now much tighter. However, there is no change to what lies underneath.

An overhaul of that magnitude would in most products warrant a major release; however, the crew @ 1060 Research is too honest and rated the two-week effort as only a minor release.

In my previous three posts I looked at the core of Resource Oriented Computing. In this and a couple more posts I am going to discuss some very common patterns. Today the spotlight is on the mapper.

== Intermezzo (you can safely skip this) BEGIN ==

Before I embark on that quest - creating a real service as I do so - I want to spend a moment on what is known as the magic/silver bullet fallacy. Also known as the deus ex machina fallacy :
  • There is no such thing as a free lunch (no, not even in the Google/Apple/... offices). You can safely ignore any paradigm that claims it will solve problems effortlessly. That goes against the second law of thermodynamics. ROC claims no such thing. What it does claim is that it allows you to solve problems a lot more efficiently !
  • I can explain it to you but I can't understand it for you. I got this bon mot from the Skype status of a very smart developer. Ironically he didn't quite understand ROC, but that is not the point. The point is that effort is involved.
Why am I saying this ? Because the service I'm going to create in this blog involves the use of an XSLT transformation and stylesheet. Maybe you master this technology, maybe you don't. You don't have to master it to follow the example, but my point is that you need some tools in your IT toolbelt. NetKernel comes with a lot of tools in the box (and you can add your own too). It will take effort to master them.
== Intermezzo (time to pay attention again) END ==

Amazon Web Services has grown into a large platform with a plethora of services (ec2, s3, ...). You can integrate these into your own solutions using the SDK or an HTTP service. In the latter case, each service has its own URL, which also depends on the region that you want to use the service in.

I'm ignoring the SDK for now. Integrating 3rd party libraries is a topic for another post.

In order to keep track of all these URLs, Amazon has published a regions.xml resource. As you can see, it contains all the URLs for all the services in all the regions. Woot !

But what if I just want to know what the Hostname is for the S3 service in the eu-west-1 region ? Well, that's the service I am going to build. It will define this resource : res:/amazonaws/baseurl/{service}-{region}.

For a first iteration we are going to download the regions.xml file and include it into our module itself as a file resource, res:/resources/xml/regions.xml.
    <rootspace
        name="amazonaws sdk import"
        public="false"
        uri="urn:com:amazonaws:sdk:import">
       
        <fileset>
            <regex>res:/resources/xml/.*</regex>
        </fileset>
    </rootspace>


The regions.xml file is an XML document. Processing an XML document can be done in many ways; I like using XSLT, which takes the data and applies a transformation stylesheet to it. So we are also going to require a stylesheet, regions.xsl, which is also going to be a file resource, res:/resources/xsl/regions.xsl. We could define another fileset for that, but with a small change to the above we can cater for both :
    <rootspace
        name="amazonaws sdk import"
        public="false"
        uri="urn:com:amazonaws:sdk:import">
       
        <fileset>
            <regex>res:/resources/(xml|xsl)/.*</regex>
        </fileset>
    </rootspace>


The stylesheet itself looks like this :
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:nk="http://1060.org"
    exclude-result-prefixes="nk">
   
    <xsl:output
        method="text"
        encoding="UTF-8"
        media-type="text/plain"/>

    <xsl:param name="service" nk:class="java.lang.String"/>
    <xsl:param name="region" nk:class="java.lang.String"/>
   
    <xsl:template match="/XML">
        <xsl:if test="Regions/Region[Name=$region]/Endpoint[ServiceName=$service]/Hostname">
            <xsl:value-of select="Regions/Region[Name=$region]/Endpoint[ServiceName=$service]/Hostname"/>
        </xsl:if>
        <xsl:if test="not(Regions/Region[Name=$region]/Endpoint[ServiceName=$service]/Hostname)">
            <xsl:text>unknown</xsl:text>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>


There, that's that. NetKernel comes with a lot of tools for processing and transforming data and since its roots lie in the XML era, it is no surprise that an XSLT transformation engine is readily available. It is part of the xml:core toolset. Including xml:core changes the above space to this :
    <rootspace
        name="amazonaws sdk import"
        public="false"
        uri="urn:com:amazonaws:sdk:import">
       
        <fileset>
            <regex>res:/resources/(xml|xsl)/.*</regex>
        </fileset>
       
        <import>
            <!-- active:xsltc -->
            <uri>urn:org:netkernel:xml:core</uri>
        </import>
    </rootspace>


Right, as you probably deduced, I've put our bill-of-materials in a private space, as I don't want any of it publicly available. Before we get to the public part, let's see for a moment what we can do in this space.

The full declarative request is :
<request>
    <identifier>active:xsltc</identifier>
    <argument name="operand">res:/resources/xml/regions.xml</argument>
    <argument name="operator">res:/resources/xsl/regions.xsl</argument>
    <argument name="region">data:text/plain,eu-west-1</argument>
    <argument name="service" >data:text/plain,s3</argument>
    <representation>java.lang.String</representation>
</request>


And no, I haven't done anything strange or new with that (although it is a new - and very welcome - feature of the Resource Trace Tool); it is the equivalent of :
active:xsltc+operand@res:/resources/xml/regions.xml+operator@res:/resources/xsl/regions.xsl+region@data:text/plain,eu-west-1+service@data:text/plain,s3

Which is certainly a level up from a simple resource request like res:/resources/xml/regions.xml (try that first if you're uncomfortable with the above) but you can see how it is actually merely pulling together four resource requests as input for the active:xsltc resource request. This process is called composing.

If you've looked at the documentation for active:xsltc, you might wonder why I have the extra "region" and "service" arguments. Well, they are varargs (variable arguments), which in this case are "optional parameters provided to the stylesheet runtime". You'll find them used in the stylesheet.
 
What is left is to map our proposed resource res:/amazonaws/baseurl/{service}-{region} to the above request in a public space :
    <rootspace
        name="amazonaws sdk"
        public="true"
        uri="urn:com:amazonaws:sdk">
       
        <mapper>
            <config>
                <endpoint>
                    <grammar>
                        <simple>res:/amazonaws/baseurl/{service}-{region}</simple>
                    </grammar>
                    <request>
                        <identifier>active:xsltc</identifier>
                        <argument name="operand">res:/resources/xml/regions.xml</argument>
                        <argument name="operator">res:/resources/xsl/regions.xsl</argument>
                        <argument method="as-string" name="region">[[arg:region]]</argument>
                        <argument method="as-string" name="service">[[arg:service]]</argument>
                        <representation>java.lang.String</representation>
                    </request>
                </endpoint>
            </config>
            <space>
                <import>
                    <uri>urn:com:amazonaws:sdk:import</uri>
                    <private/>
                </import>
            </space>
        </mapper>
    </rootspace>


While this may not all be immediately clear, it should be clear what happens : an incoming resource request of the form res:/amazonaws/baseurl/{service}-{region} gets mapped to an active:xsltc resource request, which is issued into the wrapped space. The response representation goes the other way.

The method="as-string" attribute turns values into resources. Don't worry about that too much at this point. I could have specified those arguments as follows :
   <argument name="region">data:text/plain,[[arg:region]]</argument>
   <argument name="service">data:text/plain,[[arg:service]]</argument>

but then I'd have had to import the Layer1 space as I explained in my post about spaces.
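
For reference, that would have been the following import inside the mapper's wrapped space (assuming the standard layer1 space, which is where the data: scheme support lives) :

    <import>
        <!-- data: scheme -->
        <uri>urn:org:netkernel:ext:layer1</uri>
    </import>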

Check out the result with the Resource Trace Tool : you should get s3-eu-west-1.amazonaws.com back.

If you remember from last time (extra credits to you !) how to make this available in your browser as http://localhost:8080/amazonaws/baseurl/s3-eu-west-1 ... you've just created a real and useful service. Congratulations !


Having regions.xml as a file in your module is fine, but it would be better if we could use the regions.xml that Amazon has published. You can (this will require internet access to work). First replace 
  <argument name="operand">res:/resources/xml/regions.xml</argument>
with
  <argument name="operand">https://raw.github.com/aws/aws-sdk-java/master/src/main/resources/etc/regions.xml</argument>

Then redeploy the module and check out the result. This time the request fails. The problem is that your space doesn't know yet how to resolve an http:// resource. Not a problem though, NetKernel has a toolkit available that does know :
    <rootspace
        name="amazonaws sdk import"
        public="false"
        uri="urn:com:amazonaws:sdk:import">
       
        <fileset>
            <regex>res:/resources/(xml|xsl)/.*</regex>
        </fileset>
       
        <import>
            <!-- active:xsltc -->
            <uri>urn:org:netkernel:xml:core</uri>
        </import>
       
        <import>
            <!-- http:// scheme  -->
            <uri>urn:org:netkernel:client:http</uri>
        </import>

    </rootspace>


Redeploy, try again and you'll get the same result as before.


Summary
  • We've looked at an interesting pattern today, the mapper pattern. 
  • We've composed resources together into new resources; we wrote no custom code.
  • We've made use of tools provided in NetKernel.
  • We have built a real service.
It is a lot of information and you may feel that I'm taking a huge leap forward from where I left off last time, but do take it one step at a time and you'll see that is not the case. Enjoy !

2013/11/16

back to the beginning ... transports

Today I'm making good on my promise. We're going to make the res:/introduction/helloworld-xxxxx resources available in your browser. If the host you are running NetKernel on is open to the Internet, we might even make them available to the whole world !

I've had some questions about it so I want to repeat one point I made. The last two posts did cover the core of Resource Oriented Computing ! There really is no more. You do have all the building blocks and, not unlike Lego bricks, they are simple. There is however no - practical - limit to what you can build with them.


Before we start making changes to our introduction module, I first have to explain about transports. NetKernel provides a Resource Oriented ... abstraction or box or - my personal view as a former System Administrator - operating system, and until now we were only able to request the resources inside it with the Request Resolution Trace Tool. I bet that is not how you want to run your software. Transports solve this problem. A transport listens for incoming requests and brings them inside. The response (resource representation) goes the other way.

That may sound complex but again you are most likely already familiar with this concept. If you're a database administrator, you might know that an Oracle server listens on port 1521 for incoming database requests. The Apache server listens on port 80 for incoming http:// requests. The ssh daemon listens on port 22 for incoming ssh requests. Und so weiter.

As you can see, thinking of NetKernel as an operating system was not so silly, for indeed the transports it comes with work in exactly the same way. NetKernel has several transports available; out of the box it has two running. On ports 8080 and 1060 it has Jetty-based (Jetty is an alternative to Apache) server transports listening for incoming http:// requests.

The transport on port 8080 is also known as the FEF - Front End Fulcrum, the one on port 1060 as the BEF - Back End Fulcrum (or Administrative Fulcrum). For me personally that last acronym is confusing, as BEF also stands for Belgian Franc (obsolete since 2002); I guess my age is starting to show.

You may already be able to deduce where this is going, these transports translate incoming http:// requests into res:/ requests.

For example :
  1. You request http://localhost:1060/tools/requesttrace in your browser.
  2. The transport listening on port 1060 translates this into a res:/tools/requesttrace request.
  3. The res:/tools/requesttrace resource can be found in the urn:org:netkernel:ext:introspect space (don't worry for now about how I found that out).
  4. Therefore it must follow that the space that contains the transport - that would be urn:org:netkernel:fulcrum:backend - has access to (imports) the urn:org:netkernel:ext:introspect space.
We are going to prove the above by making our res:/introduction/helloworld-xxxxx resources available to the FEF.

We could just as well do it on the BEF; go ahead and try that as an exercise if you like !

The FEF lives in [installationdirectory]/modules/urn.org.netkernel.fulcrum.frontend-1.5.12. Change the module.xml file in there to import the urn:org:elbeesee:experiment:introduction space. Where ? Well, right underneath the comment :

  <!--Import application modules here-->
  <import>
    <uri>urn:org:elbeesee:experiment:introduction</uri>
  </import>

Save and try http://localhost:8080/introduction/helloworld-file in your browser.
Something went wrong. If you've been very attentive you may have noticed that the module.xml file of the FEF does not contain a <dynamic/> tag : changes to the module.xml file are not picked up automatically. Stop and start NetKernel and try again.
Success. We have proved the concept.

However, if you're anything like me, it doesn't feel like success. Sure, it works, but that's quite a process to go through. Changing a file and stopping and starting NetKernel to deploy a resource to the FEF ? Anno Domini 2013 ? Really ? In fact, it was the process to follow in NK3. Things have changed.

You may undo your changes to the FEF module.xml, stop and start NetKernel again (the introduction urls shouldn't work any longer), and then I want to draw your attention to this snippet :

  <!--Dynamic Imports-->
  <endpoint>
    <prototype>SimpleImportDiscovery</prototype>
    <grammar>active:SimpleImportDiscovery</grammar>
    <type>HTTPFulcrum</type>
  </endpoint>


What does that mean ? Well, it means that the FEF will automatically import spaces that contain a certain resource. So instead of changing the FEF, we're going to change our own module and add the following to the
urn:org:elbeesee:experiment:introduction space :

  <literal type="xml" uri="res:/etc/system/SimpleDynamicImportHook.xml">
     <connection>
       <type>HTTPFulcrum</type>
     </connection>
  </literal>

The uri and content of the resource are what the SimpleImportDiscovery endpoint expects. You can also provide the resource in any of the other forms that we have seen (a file resource is very common).
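
For example, the file variant would be a fileset exposing the same uri - a sketch, assuming the document lives at etc/system/SimpleDynamicImportHook.xml inside the module and contains the same <connection> element :

  <fileset>
    <regex>res:/etc/system/SimpleDynamicImportHook.xml</regex>
  </fileset>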

And you'll notice the introduction urls work once more.

Not only is this very convenient (and we take that for granted in this day and age), it is also a very powerful pattern that you can use for your own solutions !

This brings me to the end of this post. Before I go I'd like to make an important observation. It is very easy to forget - I know I did for quite a while - that transports other than the default (http://) ones can be used to trigger resource requests. There's a cron-transport. An ssh-transport. A mail-transport. And if all of the available transports don't fill your needs, it has been shown that any of the Apache Camel Components can be turned into a custom transport for NetKernel.