Colin Chaplin, Colin@Wordfish.co.uk
Microsoft's Intelligent Application Gateway (IAG) includes a number of built-in application optimisers to secure and protect the applications you want to publish. It also features a high-performance search-and-replace engine that can filter web pages in-line; coupled with URL rules, this lets you build your own complex application optimisations.
It's easy to build custom filters with IAG.
It’s straightforward to create your own rules to meet your own business needs. This article explains how by working through an example: publishing Outlook Web Access (OWA) using labelling/metadata tags.
Like many organisations, Contoso.com makes use of metadata tags to assist with archiving, retention and security policies. These are implemented by the use of a label in the subject line of all emails. Exchange transport rules add labels if users do not apply them.
Due to the nature of Contoso.com’s business, they wish to prevent some emails being viewed from ‘untrusted’ workstations, for example, web cafes.
Contoso already uses IAG to provide access to OWA 2007. IAG already includes a number of powerful features that allow you to decide what a ‘trusted’ workstation is, for example, a registry key or anti-virus update level.
With Contoso, all IAG access will be from machines deemed untrusted, and therefore the following business rules need to apply:
This example takes you through the design, thought process and implementation of an application optimiser to satisfy the above business rules, and assumes a basic familiarity with IAG. The IAG example Virtual Machines available from the Microsoft website are a suitable candidate for following this example.
To create the rule that enables this functionality, we need to understand how the application operates. The best way to do this is to browse a number of sessions using a network sniffer or debugging proxy such as Wireshark, Netmon or Fiddler to understand the HTML and related syntax produced.
Let’s take this example of an OWA page:

A full capture of the underlying HTML is available here; the edited highlights are below:
<!-- Copyright (c) 2006 Microsoft Corporation. All rights reserved. -->
<!-- OwaPage = ASP.forms_premium_readmessage_aspx -->
<html dir="ltr">
<head>
<title>[FINANCIAL] Takeover plans</title>
</head>
<body class="rdFrmBody">
<div id=divThm style=display:none _def=8.0.685.24/themes/base/>
</div>
<textarea id="txtBdy" class="w100 txtBdy" style="display:none">
<div dir="ltr"><font face="Tahoma" color="#000000" size="2">Let's go buy litware inc</font></div> </body> </html>
</textarea>
</div>
</body>
</html>
The screenshot earlier shows the HTML produced when a page is requested. What we are looking for is a repeatable, common pattern so we can then write a regular expression to define what is acceptable. Examining the HTML syntax above (and testing against a number of scenarios), we can see a pattern forming that would meet our needs:
We also need to decide what we’re replacing the redacted text with. Ideally this should be as similar to the original page as possible to avoid script errors. For simplicity, in this example, the replacement text is to be:
<body class="frmBody">
<div id=divThm style=display:none _def=8.1.240.5/themes/base/>
</div>
The policy of your organisation does not permit access to this email from this location
</body>
</html>
To do this, you need to understand how to construct regular expressions (also called regex or regexps), and there are plenty of guides available on the web.
To test your regexp, copy and paste your HTML grab into your favourite text editor that supports regular expressions to search for text.
In this case, we want to achieve the following:
Search for the body content of a webpage, and if it doesn’t have [PERSONAL], [SOCIAL] or [LOWRISK] at the start of the subject line, then redact the body text (replace the text with something else)
Referring back to Step 1, we note that we want to begin and end the search looking for:
<title>something</html>
This would translate into a regexp as:
<title>.*</html>
However, this isn’t good enough; it would match ALL pages and mean they would ALL be redacted.
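To see the problem outside IAG, here is a minimal sketch in Python. The two pages are made-up stand-ins, and Python's re module is only an approximation of IAG's own engine, so treat the result as indicative rather than definitive.

```python
import re

# The naive pattern from the text. re.DOTALL lets "." span line breaks and
# re.IGNORECASE makes <TITLE> and <title> equivalent (both are assumptions
# about how the real engine is configured).
NAIVE = re.compile(r"<title>.*</html>", re.DOTALL | re.IGNORECASE)

# Hypothetical pages: one that should be blocked, one that should be allowed.
financial_page = (
    "<html><head><title>[FINANCIAL] Takeover plans</title></head>"
    "<body>secret</body></html>"
)
personal_page = (
    "<html><head><title>[PERSONAL] Lunch?</title></head>"
    "<body>hello</body></html>"
)


def naive_matches(page: str) -> bool:
    """Return True when the naive pattern would trigger a redaction."""
    return NAIVE.search(page) is not None


print(naive_matches(financial_page))  # True
print(naive_matches(personal_page))   # True: the permitted page is caught too
```

Both pages match, which is exactly why the expression needs an ‘unless’ clause.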
What we have to do now is define the strings that poison the search expression – in other words, if these terms are in the search string then do not match the string.
This is a bit of a leap, so we need to go back to our business rules:
As it stands, it’s difficult to write a regexp to cover these two rules. However, they could be re-written to say:
This is semantically the same but much easier to write into a regexp as it is one rule. It also ‘fails safe’ because if a new label is used, the page will not be displayed by default.
We now need to write the ‘unless’ part of our regexp. For this, we’ll use the fantastically titled ‘negative lookahead’, combined with a wildcard.
The negative lookahead is written as (?!...) and can be thought of as a ‘NOT’. Square brackets are special characters in a regexp (they define a character class), so the literal brackets around our labels must be escaped with a backslash. It would look like this:
(?!\[PERSONAL\]|\[SOCIAL\]|\[LOWRISK\]).*
Which reads as:
Zero or more of any characters, provided the text does not start with [PERSONAL] or [SOCIAL] or [LOWRISK].
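The lookahead can be tried on a few subject lines before it goes anywhere near IAG. The sketch below uses Python's re module as a stand-in for IAG's engine (an assumption), and the subject lines other than the ones in this article are invented. It also shows why the square brackets need escaping: left bare, each label is read as a character class.

```python
import re

# Escaped brackets: the labels are matched literally.
ESCAPED = re.compile(r"(?!\[PERSONAL\]|\[SOCIAL\]|\[LOWRISK\]).*")

# Unescaped brackets: [PERSONAL] means "any ONE of the letters P,E,R,S,O,N,A,L".
UNESCAPED = re.compile(r"(?![PERSONAL]|[SOCIAL]|[LOWRISK]).*")


def should_redact(pattern: re.Pattern, subject: str) -> bool:
    """A match means 'no permitted label at the start', so redact."""
    return pattern.fullmatch(subject) is not None


# Escaped form behaves as the business rule intends.
print(should_redact(ESCAPED, "[FINANCIAL] Takeover plans"))  # True
print(should_redact(ESCAPED, "Label-less"))                  # False... see note
print(should_redact(ESCAPED, "[PERSONAL] Lunch?"))           # False

# Unescaped form gets it backwards for these inputs.
print(should_redact(UNESCAPED, "[PERSONAL] Lunch?"))  # True: wrongly redacted
print(should_redact(UNESCAPED, "Sales figures"))      # False: wrongly shown
```

Note on the second line: "Label-less" happens to start with L, but with escaped brackets that no longer matters, so the call actually returns True and the message is redacted, as the rules require. The comment is left deliberately provocative; run the sketch and check each result yourself rather than trusting the annotations.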
So, putting it all together we get:
<title>(?!\[PERSONAL\]|\[SOCIAL\]|\[LOWRISK\]).*</html>
Finally, although IAG lets us define which pages this search-and-replace will act upon, it is worth tailoring the expression itself to the exact pages you intend to filter, to minimise any mishaps.
If we examine the code, we can see the following:
<!-- OwaPage = ASP.forms_premium_readmessage_aspx -->
So with a bit more experimentation, we can come up with:
.*readmessage.asp.*<title>(?!\[PERSONAL\]|\[SOCIAL\]|\[LOWRISK\]).*</html>
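Before building the expression into IAG, it can be exercised against a cut-down copy of the captured page. This is only a sketch: it assumes Python's re module behaves closely enough to IAG's engine, it needs re.DOTALL so that "." can cross line breaks, and make_page is a hypothetical helper that imitates the OWA markup shown earlier.

```python
import re

# Full expression, with the label brackets escaped so they match literally.
PATTERN = re.compile(
    r".*readmessage.asp.*<title>(?!\[PERSONAL\]|\[SOCIAL\]|\[LOWRISK\]).*</html>",
    re.DOTALL,  # assumption: "." must be allowed to span multiple lines
)

# Simplified replacement page, as used in this article.
REPLACEMENT = (
    '<body class="frmBody">\n'
    "The policy of your organisation does not permit access to this email "
    "from this location\n"
    "</body>\n</html>"
)


def make_page(subject: str) -> str:
    """Hypothetical helper: build a tiny page shaped like the OWA capture."""
    return (
        "<!-- OwaPage = ASP.forms_premium_readmessage_aspx -->\n"
        '<html dir="ltr">\n<head>\n'
        f"<title>{subject}</title>\n</head>\n"
        '<body class="rdFrmBody">\n'
        '<textarea id="txtBdy">Message body</textarea>\n'
        "</body>\n</html>"
    )


def redact(page: str) -> str:
    """Apply the search-and-replace once, as the filter would."""
    return PATTERN.sub(REPLACEMENT, page, count=1)


blocked = redact(make_page("[FINANCIAL] Takeover plans"))
allowed = redact(make_page("[PERSONAL] Lunch?"))
print(blocked == REPLACEMENT)                   # True: whole page replaced
print(allowed == make_page("[PERSONAL] Lunch?"))  # True: page left untouched
```

Because the expression begins with a wildcard, a match swallows the page from its first character, so the replacement text stands in for the entire response.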
Now that we’ve built our regular expression, we need to build it into IAG.
On your IAG, load the Editor from the Whale Communications IAG\Additional Tools menu, then open the file
C:\Whale-Com\e-Gap\von\conf\SRATemplates\WhlFiltSecureRemote_HTTP.xml
(or WhlFiltSecureRemote_HTTPS.xml if that’s what your portal uses)
To get an example of what we are going to do, search for
<SEARCH encoding="base64">
The first thing you’ll notice is that the search string looks garbled: it is encoded in Base64, which makes life easier because you do not need to escape control characters.
To decipher the text, click your cursor at the start of the encoded text, hold shift then use the right arrow key to select text until the </SEARCH> tag, then click ‘From 64’, as shown on the screenshot below.
Don’t use the mouse to click and drag, because it often helpfully tries to select the end of the text – and fails!

This then gives you an idea of the syntax required for our search-and-replace instruction:
<APPLICATION>
<APPLICATION_TYPE>application name</APPLICATION_TYPE>
<URL>
<NAME>URL for the pages required</NAME>
<SEARCH mode="regexparam" encoding="base64">search regexp – base64 encoded
</SEARCH>
<REPLACE encoding="base64">Replacement text base 64 encoded </REPLACE>
</URL>
</APPLICATION>
So, our example will be:
<APPLICATION>
<APPLICATION_TYPE>owa2007</APPLICATION_TYPE>
<URL>
<NAME>.*</NAME>
<SEARCH mode="regexparam" encoding="base64">.*readmessage.asp.*<title>(?!\[PERSONAL\]|\[SOCIAL\]|\[LOWRISK\]).*</html> </SEARCH>
<REPLACE encoding="base64"><body class="frmBody"> <div id=divThm style=display:none
_def=8.1.240.5/themes/base/></div> The policy of your organisation does not permit access to this email from this location </body></html> </REPLACE>
</URL>
</APPLICATION>
Note:
Select a location for your search-and-replace instruction and nestle it between an existing
</APPLICATION> and <APPLICATION> tag.

IAG is very, very unforgiving (and unhelpful) if you get any syntax wrong. One way to protect against this is to load the XML file in Internet Explorer and allow active content. This will highlight some syntactical errors.
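If Internet Explorer is not to hand, a well-formedness check can be sketched in a few lines of Python using the standard library's XML parser. This only confirms that the XML parses; it knows nothing about what IAG itself expects, and both fragments below are invented for illustration (YWJj is simply "abc" in Base64).

```python
import xml.etree.ElementTree as ET
from typing import Optional


def check_well_formed(xml_text: str) -> Optional[str]:
    """Return None if the text parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(xml_text.strip())
    except ET.ParseError as err:
        return str(err)
    return None


# Illustrative fragment with Base64 content: parses cleanly.
GOOD = """
<APPLICATION>
<APPLICATION_TYPE>owa2007</APPLICATION_TYPE>
<URL>
<NAME>.*</NAME>
<SEARCH mode="regexparam" encoding="base64">YWJj</SEARCH>
<REPLACE encoding="base64">YWJj</REPLACE>
</URL>
</APPLICATION>
"""

# Same idea with the raw regexp pasted in: <title> and </html> are read as tags.
BAD = """
<URL>
<SEARCH mode="regexparam" encoding="base64">.*<title>.*</html></SEARCH>
</URL>
"""

print(check_well_formed(GOOD))  # None
print(check_well_formed(BAD) is not None)  # True: the parser reports an error
```

The second fragment shows one reason the Base64 step matters: unencoded markup inside SEARCH or REPLACE breaks the structure of the file.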

If we pass the IE test, then it’s time to activate the configuration. When you activate the IAG configuration, ensure that ‘Apply Changes made to external configuration settings’ is ticked.

You may encounter one of the following errors:
If so, it’s almost certainly to do with invalid syntax: are you sure you’ve Base64 encoded everything properly?
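One way to rule out an encoding slip is to produce the Base64 text independently and compare it with what is in the file. The sketch below assumes the strings are encoded as UTF-8 before Base64 is applied; that is a guess, so decode an existing entry from the template with the Editor first and confirm the convention matches.

```python
import base64


def to_base64(text: str) -> str:
    """Encode text as Base64 (assumes UTF-8 bytes, which is unconfirmed for IAG)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_base64(encoded: str) -> str:
    """Decode Base64 back to text, tolerating surrounding whitespace."""
    return base64.b64decode(encoded.strip()).decode("utf-8")


search = r".*readmessage.asp.*<title>(?!\[PERSONAL\]|\[SOCIAL\]|\[LOWRISK\]).*</html>"
encoded_search = to_base64(search)

# Round trip: what comes back must be exactly what went in.
print(from_base64(encoded_search) == search)  # True
print(encoded_search)
```

Paste the value from your own SEARCH or REPLACE element into from_base64 and check that the decoded text is exactly the expression you intended, with no stray spaces or line breaks.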
Now you have ironed out all the syntax errors, it’s time to test. Let’s look at a message that the business rules state we should not be able to see:

Note the IE Script error message; this is due to our simplified replacement text. Now, let’s double-click on another prohibited email (the email with subject “Label-less”) and examine the results:

Finally, let’s confirm all is OK by attempting to read the email we should be able to access:

Success! Your information security policy can be complied with, and you can chalk up one more victory with the assistance of IAG.
Although the example above takes a number of design, development and test short-cuts, we can see it’s achievable to write your own application optimisers to meet your own business needs. Wordfish Ltd is an IT consultancy specialising in infrastructure design, novel solutions and web development. If you would like some help with your IAG application optimisers, we'd love to hear from you.