MarkLogic: NoSQL before NoSQL was cool

February 25, 2012

In response to NoSQL before NoSQL was cool and the proof

If you ever needed to validate a CALS table using schematron

June 3, 2011

Here is a nifty little Schematron rule to validate CALS tables.  In particular, it checks
that the morerows, namest and nameend attributes are properly placed.  I hope this helps someone; it took me some time to hack it together.

<sch:rule id="TABLE_ROWS" context="*:row" xmlns:sch="">
  <sch:let name="entry-count" value="count(*:entry)"/>
  <sch:let name="tgroup" value="./ancestor::*:tgroup[1]"/>
  <sch:let name="cols-count" value="xs:integer($tgroup/@cols)"/>
  <sch:let name="colspans" value="sum(for $e in ./*:entry[@namest] return xs:integer(replace($e/@nameend,'[^\d]','')) - xs:integer(replace($e/@namest,'[^\d]','')))"/>
  <sch:let name="morerows" value="(preceding-sibling::*:row[*:entry/@morerows])[position() eq last()]"/>
  <sch:let name="morerows-value" value="$morerows/*:entry/@morerows[position() eq 1]"/>
  <sch:let name="morerows-position" value="count($morerows/preceding-sibling::*:row)"/>
  <sch:let name="rows-distance" value="count(preceding-sibling::*:row)"/>
  <sch:let name="morerows-distance" value="$rows-distance - $morerows-position"/>
  <sch:let name="morerows-count" value="if ($morerows) then if ($morerows-distance le xs:integer($morerows-value)) then count($morerows/*:entry) - 1 else 0 else 0"/>
  <sch:assert id="TABLE_ROW_MATCH_COLS_COUNT" test="$cols-count eq count(*:entry) + $morerows-count + $colspans" flag="ERROR">
    The Count of row/*:entry elements must match column specification in tgroup
    |$morerows distance: <sch:value-of select="$morerows-distance"/>
    |$morerows value: <sch:value-of select="$morerows-value"/>
    |in-distance: <sch:value-of select="$morerows-distance le xs:integer($morerows-value)"/>
    |more-rows-entry-count: <sch:value-of select="count($morerows/*:entry) - 1"/>]
  </sch:assert>
</sch:rule>


I love you, you're perfect, Now Change!!!

April 24, 2010

I have been working at MarkLogic for a little short of two months now, and I must say it has been an awesome experience to work in a place where everyone shares your passion for XML development and XQuery.  I find it hard to write in any other language, and I have to remind myself not to start writing let statements such as $var as xs:string := "xxxx" or beginning a loop as a FLWOR expression.

For all my love of ML and XQuery, I still find myself defending its honor to my Java/.NET colleagues.  I may win on many levels: on object orientation, on how imperative languages are beginning to look a lot like functional languages, and on how much more natural (yet verbose) XML is for object orientation than, say, the relational model.  I don’t lose too many arguments on those points, and I am one to fight tooth and nail to support XQuery.  MarkLogic makes building search and REST services a trivial effort, and pretty much anything that is exposed as XML can be easily worked with.

Yet I have been thinking for some time that I would love to see it evolve from an XML search platform and XML query language into the rapid application development platform for ____fill in the blanks____.  This evolution needs to start with writing the next killer application, one that can reach the masses of developers who struggle to develop XML applications but rely on Java/.NET because of the lack of frameworks beyond parsing, transforming and searching XML.

  • We need a Rails-type framework for web development.
  • We need a Service Bus framework for enterprise integration.
  • We need an AOP framework that allows loose coupling between our application and business logic.
  • We need the ability to inspect and parse XQuery to support dynamic programming.
  • We need better integration with our imperative brothers and sisters.
  • We need XQuery to Change!!!!

Converting Arabic to Roman Numerals

April 13, 2009

So I spent a little time racking my head to write a number converter for Roman numerals in XQuery.  Most examples from Java and the like use a while loop to do the conversion.  That is not possible in XQuery, so the while loop has to be simulated with recursion.  The solution is very simple: use a simulated queue that pops off the values while it builds the Roman numeral.

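declare variable $romanAlpha as xs:string* :=
  ("M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I");

(: Values paired by position with $romanAlpha, largest first :)
declare variable $romanNums as xs:integer* :=
  (1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1);

(: Converts an Arabic number to a Roman numeral :)
declare function local:number-to-roman($num as xs:integer) as xs:string {
  if ($num eq 0) then
    ""
  else if ($num gt 3999) then
    fn:error(xs:QName("INVALID_ARGUMENT"), "Cannot Convert Number Larger than 3999")
  else
    local:recursive-roman($num, "", $romanNums)
};

(: Recursion method used to calculate the Roman numeral :)
declare function local:recursive-roman(
  $num as xs:integer,
  $alpha as xs:string,
  $sequences as xs:integer*
) as xs:string {
  (: Nothing left to convert, or the queue is exhausted :)
  if (fn:empty($sequences) or $num eq 0) then
    $alpha
  else
    let $i := $sequences[1]
    let $rom-a := $romanAlpha[fn:index-of($romanNums, $i)]
    return
      if ($num gt $i) then
        (: Emit the symbol and keep the same head of the queue :)
        local:recursive-roman($num - $i, fn:concat($alpha, $rom-a), $sequences)
      else if ($num lt $i) then
        (: Value too large: pop it off the queue :)
        local:recursive-roman($num, $alpha, fn:remove($sequences, 1))
      else
        (: $num eq $i: the last symbol completes the numeral :)
        fn:concat($alpha, $rom-a)
};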

New Job New Challenges Same old content

March 2, 2009

I started working with my new company, McGraw Hill Companies, a month ago, and I must say that content is at the forefront of my challenges.  Some of the key things I am working on are EPublishing, Content to Layout, and the complexities of being in a larger organization.

Inverse Citation Frequency

December 15, 2008

Working for a legal publisher, we face many challenges related to content relations and keeping content relevant.  In legal content, citations set precedent, and legal professionals use them to relate cases or to understand rulings on cases.  I have been formulating a concept, probably for a well-known issue, that I call “inverse citation frequency”.  The principle follows that of most search engines, which use inward links to a document as a mechanism for scoring its relevancy.  Given the citations found within a document, one would relate it to other documents that share the same citation or group of citations, including an element of the sentiment of the case’s ruling.  The identification and normalization of citations would drastically improve the cross-linking of news stories, cases to cases, etc.
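
One possible reading of the idea, sketched in XQuery. The weighting is an assumption borrowed from inverse document frequency (a citation shared by few documents counts for more than one cited everywhere); the post itself does not define a formula. The sample documents, the element names doc and cite, and the functions local:icf and local:similarity are all hypothetical, and sentiment is left out entirely.

```xquery
xquery version "3.0";
declare namespace math = "http://www.w3.org/2005/xpath-functions/math";

(: Hypothetical sample data: each doc lists the normalized citations found in it. :)
declare variable $docs as element(doc)* := (
  <doc id="a"><cite>410 U.S. 113</cite><cite>5 U.S. 137</cite></doc>,
  <doc id="b"><cite>410 U.S. 113</cite><cite>5 U.S. 137</cite></doc>,
  <doc id="c"><cite>410 U.S. 113</cite></doc>,
  <doc id="d"><cite>347 U.S. 483</cite></doc>
);

(: Assumed weight: ln(total documents / documents containing the citation). :)
declare function local:icf($cite as xs:string) as xs:double {
  let $n  := fn:count($docs)
  let $df := fn:count($docs[cite = $cite])
  return if ($df eq 0) then 0e0 else math:log($n div $df)
};

(: Relates two documents by summing the weights of the citations they share. :)
declare function local:similarity(
  $a as element(doc),
  $b as element(doc)
) as xs:double {
  fn:sum(
    for $c in fn:distinct-values($a/cite)[. = $b/cite]
    return local:icf($c),
    0e0
  )
};

(: Rank every other document against document "a". :)
let $source := $docs[@id eq "a"]
for $d in $docs[@id ne "a"]
let $score := local:similarity($source, $d)
order by $score descending
return fn:concat($d/@id, ": ", $score)
```

With this data, document b should rank above c because it shares the rarer citation as well as the common one, and d should score zero because it shares nothing with a.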

The key issue is normalization of the case citations.  While the Bluebook, Chicago Law, and the NY Style Manual have style guidelines for formatting citations, there are many permutations of how people express citations.  I have spent many hours handcrafting citation regular expressions and have found it to be a non-trivial exercise.  Sure, companies like Lexis and West have mastered this functionality in their product lines, but these systems are locked behind proprietary walls.
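
To give a feel for the permutations, here is a minimal sketch that normalizes a single citation form, volume / U.S. Reports / page, to one canonical spelling. The function name local:normalize-us-cite and the pattern are illustrative assumptions, not a rule from any style manual, and a real normalizer would need many more reporters and edge cases.

```xquery
xquery version "1.0";

(: Hypothetical helper: maps variants such as "410 U. S. 113" or "410 us 113"
   to "410 U.S. 113". Returns the empty sequence when the input does not match. :)
declare function local:normalize-us-cite($raw as xs:string) as xs:string? {
  let $s := fn:normalize-space($raw)
  let $pattern := "^(\d+)\s*U\.?\s*S\.?\s*(\d+)$"
  return
    if (fn:matches($s, $pattern, "i"))
    then fn:replace($s, $pattern, "$1 U.S. $2", "i")
    else ()
};

for $raw in ("410 U.S. 113", "410 U. S.  113", "410 us 113", "Roe v. Wade")
return fn:concat($raw, " => ", (local:normalize-us-cite($raw), "(no match)")[1])
```

Even this one reporter needs optional periods, optional spaces and case-insensitive matching, which is why the full problem gets out of hand quickly.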

Anybody have any thoughts on this??

XQuery Reflection

September 17, 2008

I have been thinking about this issue for a while with regard to working on MarkLogic as an XML platform.  My wish is for XQuery reflection functionality that would allow me to inspect module metadata for various programming tasks such as dynamic programming, documentation stubs and plugin modules.  So, given a module at a specific URI, you could ask for a serialized XML representation of the module.

module "urn:my-module"
declare namespace ns1 =
declare namespace ns2 =
import module namespace util = "urn:my-utilities" at "../utilities.xqy"

define function map($node as element(ns2:object)) as element(ns1:object)*


Would return:

<namespace-declaration prefix="ns1" uri=""/>
<namespace-declaration prefix="ns2" uri=""/>
<module-declaration prefix="util" uri="urn:my-utilities" location="../utilities.xqy"/>
<function name="map">
   <return type="element" prefix="ns1" localname="object" cardinality="*"/>
   <parameter name="node" type="element" prefix="ns2" localname="object" cardinality=""/>
</function>
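
As an illustration of the documentation-stub case, here is a sketch that turns such a descriptor back into a readable signature line. No such reflection API is assumed to exist; the descriptor is hard-coded, the wrapping module element is invented for the example, and local:type is a hypothetical helper.

```xquery
xquery version "1.0";

(: Hypothetical descriptor, wrapped in an assumed <module> element. :)
declare variable $descriptor as element(module) :=
  <module>
    <function name="map">
      <return type="element" prefix="ns1" localname="object" cardinality="*"/>
      <parameter name="node" type="element" prefix="ns2" localname="object" cardinality=""/>
    </function>
  </module>;

(: Renders a <return> or <parameter> descriptor as a sequence type string. :)
declare function local:type($t as element()) as xs:string {
  fn:concat($t/@type, "(", $t/@prefix, ":", $t/@localname, ")", $t/@cardinality)
};

(: One signature line per described function. :)
for $f in $descriptor/function
return fn:concat(
  "declare function ", $f/@name, "(",
  fn:string-join(
    for $p in $f/parameter
    return fn:concat("$", $p/@name, " as ", local:type($p)),
    ", "),
  ") as ", local:type($f/return))
```

The same walk over the descriptor could feed a plugin registry or a dynamic dispatcher instead of a documentation generator.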

 Question:  How would anybody else see this being useful?