October 13, 2008

Book Review: The Definitive Guide to Terracotta

Terracotta is a “transparent clustering technology” that allows you to make data structures available across a cluster of machines in a highly scalable and robust manner. Unlike many other clustering solutions (including the very popular memcached), it doesn’t expose an API that a developer uses to push data structures in and out of a big distributed container. Rather, it’s a library that’s bootstrapped into your JVM, with its behavior driven by an XML config file. This allows for sharing data in the fields of a class across the cluster as well as synchronized access to objects, just like in any multithreaded application. Terracotta is able to do this through some very interesting decoration of bytecode as Java classes are loaded into the JVM. What this ultimately allows for is something like a large memory heap shared by all JVMs, which can survive JVM crashes since all data is also written to disk. Additionally, since Terracotta doesn’t use a peer-to-peer approach to data replication, it’s easier to achieve linear scalability.
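To make the "no API" point concrete, here is a minimal sketch of the kind of plain Java class this model targets. It is not taken from the book or the Terracotta docs: SharedInbox and SharedInboxDemo are hypothetical names, and the assumption is that a Terracotta XML config (not shown) would declare the items field as shared and treat the synchronized methods as cluster-wide locks. As written, it runs in a single JVM with ordinary threads.

```java
import java.util.ArrayList;
import java.util.List;

// Plain Java with ordinary synchronization; nothing here references Terracotta.
// Assumption: an XML config (not shown) would mark `items` as shared across JVMs.
class SharedInbox {
    private final List<String> items = new ArrayList<>();

    public synchronized void add(String item) {
        items.add(item);
    }

    public synchronized int size() {
        return items.size();
    }
}

class SharedInboxDemo {
    public static void main(String[] args) throws InterruptedException {
        SharedInbox inbox = new SharedInbox();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            final int id = i;
            // Each writer adds 100 items through the synchronized method
            writers[i] = new Thread(() -> {
                for (int n = 0; n < 100; n++) {
                    inbox.add("writer-" + id + "-" + n);
                }
            });
            writers[i].start();
        }
        for (Thread t : writers) {
            t.join();
        }
        System.out.println(inbox.size()); // prints 400
    }
}
```

The point of the sketch is that the code is the same whether the writers are threads in one JVM or, with the right config, threads spread across several.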

Sound interesting? Learn more at the Terracotta web site.

This is an excellent book. The prose is well-written and engaging and the book flows very well from section to section. There are massive amounts of Java code and configuration, so you very rarely have to picture anything in your mind: you can just read it there on the page. There are helpful diagrams where appropriate. It’s unlikely that a reader with a good understanding of Java will become confused at any point during the book. It’s informative and provides some excellent examples of real world use, especially the chapters on integration with things like Spring, Hibernate, session replication and more. There is also an extensive chapter on using Terracotta to create a Master/Worker compute grid. If you’re looking to learn more about Terracotta I really can’t recommend this book enough: it helped fill so many gaps that I had after skimming some of the documentation and reading a few of the white papers.

The only real negative is that the book is slow to get started. The first two chapters (~40 pages) serve as an introduction and history to the technology respectively but taken together, it’s just a very lengthy introduction that rehashes a lot of the same concepts. Maybe this was tedious for me given that I’d already read a lot of documentation on Terracotta but it just seemed like the intro could have been a bit shorter.

I’ll now run down the other chapters in the book and detail some that I found particularly interesting.

Chapter 3 is a quick jump into the framework and some tooling while Chapter 4 gets into the nitty-gritty details of POJO clustering. This is important to read to understand how Terracotta does what it does. Chapter 5 talks about how to do caching and this is where you start to understand the real world problems that can be solved using the tool. Your database will thank you!

Chapter 6 is where it gets really interesting. Here you will learn how to use Terracotta as a 2nd-level cache provider for Hibernate to significantly boost performance over using something like Ehcache. More startling than that is a proposed architecture where POJO clustering is used to effectively put the data structures that hold detached objects (those not attached to an active Hibernate Session) into the cluster. You are shown how to change application code that uses the Hibernate API in a “typical” fashion to achieve performance increases measured in orders of magnitude. This is truly an eye-opener.

Chapter 7 shows you how to cluster HTTP Sessions and how you can be freed from some of the annoying restrictions of the Servlet container Session API, such as implementing Serializable and religiously using setAttribute(). This is the sort of thing you can plug into an existing application very quickly and realize enormous scalability gains.

Chapter 8 is about clustering Spring beans. Spring and Terracotta follow a very similar philosophy in that they are non-invasive frameworks. As such, they complement each other very nicely. This chapter shows how easy it is to cluster Spring beans: even easier perhaps than clustering POJOs as in Chapter 4. At this point, if you are a user of Spring and Hibernate, you’re starting to see how easy it can be to achieve serious scalability and performance improvements.

Chapter 9 talks about Terracotta Integration Modules, which are a sort of package that provides additional features to the Terracotta core: this is how integration with Hibernate and Spring is achieved. Chapter 10 gives an extended treatment of thread coordination, showing how well-written multithreaded code can be used with Terracotta to achieve thread coordination across multiple JVMs. Chapter 11 takes this further to detail the Master/Worker pattern for computing grids. Chapter 12 rounds things out by showing the visualization tools that can be used to monitor and debug an app using Terracotta.
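For readers who haven't met the Master/Worker pattern, here is a rough single-JVM sketch using nothing but java.util.concurrent. It is not the book's implementation: MasterWorkerSketch, the squaring "work" and the poison-pill shutdown are all illustrative assumptions. The idea, as I understand the chapters, is that the task and result queues are the kind of structures Terracotta would share so that workers could live in other JVMs.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MasterWorkerSketch {
    // Sentinel value telling a worker to stop (hypothetical convention)
    private static final int POISON = -1;

    // The master queues tasks 1..taskCount, workers square them,
    // and the master sums the results.
    static int run(int taskCount, int workerCount) throws InterruptedException {
        BlockingQueue<Integer> tasks = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>();

        Thread[] workers = new Thread[workerCount];
        for (int w = 0; w < workerCount; w++) {
            workers[w] = new Thread(() -> {
                try {
                    while (true) {
                        int n = tasks.take();
                        if (n == POISON) {
                            return;
                        }
                        results.put(n * n);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[w].start();
        }

        // Master: hand out the work, then one stop signal per worker
        for (int n = 1; n <= taskCount; n++) {
            tasks.put(n);
        }
        for (int w = 0; w < workerCount; w++) {
            tasks.put(POISON);
        }

        // Master: collect exactly one result per task
        int sum = 0;
        for (int i = 0; i < taskCount; i++) {
            sum += results.take();
        }
        for (Thread t : workers) {
            t.join();
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(10, 3)); // prints 385, the sum of squares 1..10
    }
}
```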

As I said before, this is a great book. If you’re interested in scaling out enterprise Java applications, you owe it to yourself to check out Terracotta and this book.

August 29, 2008

ActionScript Annoyances (Part 1)

Every time I encounter something in AS3 that annoys me, I'm going to compel myself to blog about it, in the hope that it will filter through the Internet ether to the desks of some of the Flash Player engineers in San Francisco.

Can't use static constants as default values for function parameters

Example:

public class PropertyType {
    public static const LISTING_RESIDENTIAL : String = "ListingResidential";
    ...
}

public function createSearch(propertyType : String = PropertyType.LISTING_RESIDENTIAL) : void {
    ...
}

Results in this compiler error:

1047: Parameter initializer unknown or is not a compile-time constant.

Of course the bigger complaint here might be that there are no enumerated types in ActionScript... but that's probably asking too much. :)

June 21, 2008

Integrating Blaze Data Services and Spring Security

One thing to keep in mind with the out-of-the-box security support in Blaze DS is that the approach to integration is container-specific: there is support for Tomcat (and therefore JBoss), WebSphere, Weblogic and Oracle through various implementations of the LoginCommand interface. Unfortunately, if you have custom security requirements for authentication, that means you're dealing with a lot of cumbersome, container-specific security configuration and/or writing and configuring JAAS plugins. The authorization support in Blaze DS is limited to specifying which roles have access to a particular destination, which isn't nearly flexible enough.

Fortunately Spring Security provides solutions to many common problems in the Java EE space, including features like container portability, a flexible authentication provider model, authorization of service method invocation via AOP and even some very cool ACL support to enforce granular security at the domain object level. Integrating Spring Security with Blaze DS isn't as hard as you might think either: I was able to bang out a quick proof of concept over a weekend.

The config for Spring Security 2 is quite straightforward with the new XML namespace support in your Spring config files:

<security:http>
    <security:form-login/>
</security:http>

This is basically a very stripped-down configuration since things like RememberMeServices (using cookies) don't usually apply in a Flex-based RIA. You can also throw in a very simple AuthenticationProvider like this one from the SS2 docs:

<security:authentication-provider>
    <security:user-service>
        <security:user name="jimi" password="jimispassword" authorities="ROLE_USER, ROLE_ADMIN"/>
        <security:user name="bob" password="bobspassword" authorities="ROLE_USER"/>
    </security:user-service>
</security:authentication-provider>

There are two items you need to add to your web.xml to bootstrap SS2 in a Servlet container:

<filter>
    <filter-name>springSecurityFilterChain</filter-name>
    <filter-class>org.springframework.web.filter.DelegatingFilterProxy</filter-class>
</filter>

<filter>
    <filter-name>securityContextAwareFilter</filter-name>
    <filter-class>org.springframework.security.wrapper.SecurityContextHolderAwareRequestFilter</filter-class>
</filter>

<filter-mapping>
    <filter-name>springSecurityFilterChain</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

<filter-mapping>
    <filter-name>securityContextAwareFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

The first filter is a standard part of any Spring Security configuration. The "SecurityContextHolderAwareRequestFilter" adapts SS2 to the Servlet environment so that calls like getUserPrincipal() and isUserInRole() behave as expected. This really comes in handy when you have other code in your projects that assumes a "standard" Java security setup.

Now we need a little config in the Blaze services-config.xml file:

<security>
    <login-command class="net.histos.util.spring.SpringSecurityLoginCommand" server="Tomcat"/>
    <security-constraint id="valid-user">
        <auth-method>Custom</auth-method>
        <roles>
            <role>ROLE_USER</role>
        </roles>
    </security-constraint>
</security>

Blaze DS seems to require that a "server" attribute be specified for any LoginCommand even though this isn't really used in our implementation. The security-constraint isn't necessary if you are going to use SS2's service method invocation authorization support. However if your security requirements are more straightforward you can do role/destination based restrictions here. Then simply add this element to the appropriate destinations:

<security>
    <security-constraint ref="valid-user"/>
</security>

The last part is some Java code. If you extend AppServerCommand you get a default impl for this method:

protected boolean doAuthorization(Principal principal, List roles, HttpServletRequest request) throws SecurityException

This method makes use of isUserInRole(), so by adding the servlet filter referred to above, this logic can work without any modification required. This leaves only two methods in your LoginCommand impl:

public Principal doAuthentication(String username, Object credentials) {
    log.debug("doAuthentication");
    // get the ProviderManager from app context
    Map<String, ProviderManager> map = getContext().getBeansOfType(ProviderManager.class);
    if (map.size() != 1)
        throw new RuntimeException("Spring ApplicationContext must contain exactly one ProviderManager bean");
    ProviderManager provider = map.get( map.keySet().iterator().next() );
    // authenticate
    String password = extractPassword(credentials);
    Authentication auth = provider.authenticate( new UsernamePasswordAuthenticationToken(username, password) );
    SecurityContextHolder.getContext().setAuthentication(auth);
    return auth;
}

public boolean logout(Principal principal) {
    log.debug("logout");
    SecurityContextHolder.getContext().setAuthentication(null);
    return true;
}
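The "exactly one ProviderManager" check in doAuthentication() is a pattern that could be pulled out into a small helper. Here is a minimal sketch in plain Java: SingleBeanLookup and requireSingle are hypothetical names, not part of Spring or Blaze DS, and the demo uses a String as a stand-in for the real bean so that it runs without either library.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SingleBeanLookup {
    // Hypothetical helper: returns the only value in the map, or fails loudly
    // if the map holds zero or several candidates.
    static <T> T requireSingle(Map<String, T> beans, String typeName) {
        if (beans.size() != 1) {
            throw new IllegalStateException(
                "Spring ApplicationContext must contain exactly one "
                + typeName + " bean, found " + beans.size());
        }
        return beans.values().iterator().next();
    }

    public static void main(String[] args) {
        // Stand-in for the map returned by getBeansOfType(...)
        Map<String, String> one = new HashMap<>();
        one.put("authenticationManager", "stand-in for a ProviderManager");
        System.out.println(requireSingle(one, "ProviderManager"));

        try {
            requireSingle(Collections.<String, String>emptyMap(), "ProviderManager");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```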

Those are the basics! As I said, this is a proof of concept that I haven't had time to test extensively yet, but it should get you started! One other note: I noticed while testing the Flex side that calling login() or logout() on a RemoteObject without first calling a "regular" service method would result in an error; apparently the ChannelSet hadn't been defined yet. I did a little digging and found that you're supposed to call login() or logout() on the underlying ChannelSet itself. I wrote a very simple ChannelSet implementation that can be easily instantiated in MXML and bound as the channelSet property for your RemoteObjects:

package net.histos.flex.util
{
    import mx.messaging.ChannelSet;
    import mx.messaging.channels.AMFChannel;

    public class SimpleChannelSet extends ChannelSet
    {
        public function set url(channelUrl : String) : void {
            addChannel(new AMFChannel("defaultChannel", channelUrl));
        }
    }
}

<util:SimpleChannelSet id="channelSet" url="http://localhost:8080/testapp/messagebroker/amf"/>

Then simply invoke login() and logout() on the ChannelSet itself and you should have no problems.

June 09, 2008

Railo and JBoss get crickets at The Server Side

Unfortunately Railo going OSS as part of JBoss has gotten nothing more than crickets over at The Server Side.  The reception to other dynamic languages such as JRuby or Groovy has been lukewarm at TSS so it'll be interesting to see whether CFML is able to make better inroads.

June 08, 2008

Open Source ColdFusion? BlueDragon and Railo

The month of June has been significant, with the news of the Railo CFML server joining the JBoss ecosystem and New Atlanta ready to formally announce the open source BlueDragon CFML server at the CFUnited Conference.  It's interesting to consider what this will mean for the future of CFML and obviously Adobe's official implementation of CF.

We now have two reasonable implementations of ColdFusion that are free and open source.  It's very likely that this will help spur adoption of CF, which has been fairly stagnant for a number of years.  Despite the supposedly strong sales numbers that Adobe routinely trumpets (although rarely articulates with hard numbers), job postings for CF are flat in an industry that continues to grow.  Hopefully the accessibility that an open source CF will offer will make more folks pick it up and see what it's all about.

There's also a good chance that this will drive Adobe to make CF's pricing more competitive.  Given that CF Standard sells for $1299 now it's hard to believe they would start to give that away for free.  Perhaps the price will be pushed down significantly or Adobe will make a free version available while stripping out additional features from pro.  At the end of the day, having access to a cheap version of the "real" CF can't be a bad thing.

The biggest concern has to be one of forking.  BlueDragon has for some time decided to implement their own tags or functions in the language, some of which were followed by Adobe while others were not.  Railo has also started to add their own features which I haven't seen in BD or CF.  If Adobe introduces a third version of the product that will mean at least five different deployment environments.  Is this going to be a problem?  Or will the choices just mean that the reach of CFML will expand?

In some ways this reminds me of the Java universe with its numerous implementations of the Servlet specification and/or the full Java Enterprise Edition stack.  The difference here is that these technologies are based on well-defined standards and specs that anyone is free to implement.  Most of the commercial vendors will add their own extensions to "add value" and justify their hefty license fees but the spec is implemented rather reliably across the board, giving you freedom to switch between vendors if you so desire.

Does this mean that a CF specification might be in the future?  Will Adobe cozy up to Railo to tag-team BlueDragon?  It'll be interesting to see what's in store for the second half of 2008.