Friday, October 8, 2010

Working with Hierarchical Data Part II




Last post (Working with Hierarchical Data) we saw that how can we store and retrieve hierarchical data. This post we will see how to retrieve hierarchical data from last node to root node and how to store non-transactional (master) hierarchical data to gain maximum performance benefit.

Retrieve Hierarchical Data in Reverse Order from Database


It can be possible that you want to know grand parent information of a child element. By implementing Superior function for hierarchical data you can achieve same functionality. Consider a requirement where some condition is applicable to one City then it should applicable to all parents Nodes of that hierarchy.


To achieve this functionality we will create GLSuperiors function. This function is also using CTE (Common Table Expression) query. GLSuperiors function is traverse through hierarchical information and returns the traversed hierarchy in opposite direction.


Please find create script for this function in attached SQL file.


SELECT * FROM GLSuperiors(‘BOM’)


Above example, returns all the parent-grandparent Geographic Location for GLCd = ‘BOM’.


Store non-transactional Hierarchical Data


Usually we are storing non-transactional hierarchical data in to the database, which are updating periodically and not taking part in transactions. We can use caching mechanism from Enterprise Library Application Block. We need to create parent-child data structure to store hierarchical data in our application and we just need to store root data for individual hierarchical data at some place (Cache).




Public Enum HierarchicalDataType

GULocation = 1
OrganizationHierarchy = 2
ProductHierarchy = 3
ServiceCenterHierarchy = 4
End Enum

Public Class HierarchicalData
Private dataType As HierarchicalDataType
Private parent As HierarchicalData
Private hierardhicalDataCD As String
Private hierardhicalDataDesc As String
Private order As Integer 'Optional
Private child As IList(Of HierarchicalData)
End Class



Now you just need to populate a tree using your hierarchical data. This should be one time activity you can call from Application_Start and you have to store the root node in to the cache. For better accessibility you can store node information in key-value combination (HashTable), where key is hierarchical data CD (must be a unique) and value is the main HierarchicalData object.




Consider that we have one webpage where there is some dependent drop down for Country and user is selecting “India” (CD: ‘IND’) then it should directly return all the Geographic Locations (State) with “India” as parent.




Return DirectCast(hierarchicalData("IND"), HierardhicalData).child




This reduces database call, and giving performance to the non-transactional hierarchical data.


You can also maintain order, which can be useful to display in order. This is useful to create menus.


By manipulate such functions, you can use them in AJAX, web services etc…




To understand Enterprise Library Caching please refer: http://www.codersource.net/asp_net__enterprise_library_caching_block.aspx



Database Script


----- create the Superiors function using CTE  

if object_id( 'GLSuperiors', 'IF' ) is not null drop function GLSuperiors
GO
create function GLSuperiors( @GUCD varchar(10))
returns table as return with GLSupnodes
( distance, GUID, GUCD, GUDesc, ParentGLCD ) as
(select 0, h.GUID, h.GUCD, h.GUDesc, h.ParentGUCD
from [GLHierarchy] h where h.GUCD = @GUCD
union all
select distance-1, h.GUID, h.GUCD, h.GUDesc, h.ParentGUCD
from [GLHierarchy] h inner join GLSupnodes sn on h.GUCD = sn.ParentGLCD )
select distance, GUID, GUCD, GUDesc, ParentGLCD from GLSupnodes
GO


select * from GLSuperiors('BOM')

Thursday, October 7, 2010

Working with Hierarchical Data

Retrieve Hierarchical Data from Database
Hierarchical Data Model is proven modeling technique to implement hierarchical business entities like (GL Hierarchy, Organization Hierarchy, Menu Hierarchy etc…). We can also use it to retrieve role based menu from database. Ideally hierarchical data is representing by level (depth) and consider it should use Tree Data Structure Algorithm. Now it is important to design such table using Hierarchical Data Modeling and creating queries which will return hierarchical data in optimized manner.
Create a hierarchical table like below (Geographic Location Hierarchy).


Insert some records in this table. (Please find insert script in attached SQL file)
Now it is important to retrieve data of this table in hierarchical format. That means if we are requesting for particular Geographic Location (say ‘India’) then query should return all child-grand child records for India. To achieve this we will use CTE (Common Table Expression). Please refer http://msdn.microsoft.com/en-us/library/ms190766.aspx for other usage of CTE.
Consider that you have requirement where you want return Geographic Location tree by passing GLCd. To achieve this you need to write CTE query. CTE query are bit complex to understand so we will create some functions to simplify this query.
We will create GLSubordinates function, which accept GLCd (GL Code). Please find create script for this function in attached SQL file.
SELECT * FROM GLSubordinates('IND')

Above example, returns all the child-grandchild Geographic Location for GLCD = IND.
NodeSequence is optional field, it is just showing depth of tree you can see 2, 2.5, 2.5.9, 2.5.10 etc…

CTE Queries performs quite efficiently. With index on GLID and GLCD with data up to 12 hierarchical levels for 24000 records this query is returning data less than a second.
By default recursion limit for CTE queries is 100 levels. And I guess it is enough level for normal hierarchical data, but you can also change it. Please refer http://msdn2.microsoft.com/en-us/library/ms181714.aspx.

Usage
You can use this function directly or in your queries. By adding Level check, you can customize this function to return tree up to specified level.

SQL Script

CREATE TABLE [GLHierarchy](
[GUID] [int] IDENTITY(1,1) NOT NULL,
[GUCD] [varchar](10) NOT NULL,
[GUDesc] [varchar](50) NOT NULL,
[ParentGUCD] [varchar](10) NULL
) ON [PRIMARY]
 
 
/* Insert queries */
 
INSERT INTO [GLHierarchy] Values ('AP0','Asia-Pacific',null)
 
-- Level 2 Records -- 
INSERT INTO [GLHierarchy] Values ('IND','India','AP0')
INSERT INTO [GLHierarchy] Values ('PHL','Phillipines','AP0')
INSERT INTO [GLHierarchy] Values ('CHN','China','AP0')
 
 
-- Level 3 Records -- 
INSERT INTO [GLHierarchy] Values ('MAH','Maharashtra','IND')
INSERT INTO [GLHierarchy] Values ('KAR','Karnataka','IND')
INSERT INTO [GLHierarchy] Values ('GUJ','Gujarat','IND')
 
INSERT INTO [GLHierarchy] Values ('MNL','Manila','PHL')
 
 
-- Level 4 Records --
INSERT INTO [GLHierarchy] Values ('BOM','Mumbai','MAH')
INSERT INTO [GLHierarchy] Values ('PUN','Pune','MAH')
 
INSERT INTO [GLHierarchy] Values ('BAN','Bangalore','KAR')
 
 
----- create GLSubordinates function using CTE
if object_id( 'GLSubordinates', 'IF' ) is not null drop function GLSubordinates
GO
create function GLSubordinates( @GLCD varchar(10) ) 
returns table as return with GLSubnodes( distance, GLID, GLCD, GLDesc, ParentGLCD, NodeSequence )
AS ( select 0, h.GLID, h.GLCD, h.GLDesc, h.ParentGLCD, convert( varchar(80), ltrim(str(h.GLID))) as NodeSequence 
from [GLHierarchy] h 
where h.GLCD = @GLCD 
union all 
select distance+1,  h.GLID, h.GLCD, h.GLDesc, h.ParentGLCD, convert( varchar(80), sn.NodeSequence+'.'+ltrim(str(h.GLID))) 
from [GLHierarchy] h inner join GLSubnodes sn on h.ParentGLCD = sn.GLCD)
select distance, GLID, GLCD, GLDesc, ParentGLCD, NodeSequence from GLSubnodes
GO
 
--- Test Scripts ---
select * from GLSubordinates('IND') 

Monday, October 4, 2010

Scope of Return Statement

Last week my colleague came up with one question about Scope and Reachable Code in .Net and we started understanding Reachable Code and found very interesting results.

We all know that in try-catch block, finally block must execute before returning from method, then what happened with variables declaring in Finally Block, when they are getting assign and what is use of them.

Guess, what will be the return value when calling MyFunction method in below code?

private int MyFunction()

{

            int i = 0;


try {

i = 1;

                throw new Exception("Int Error");

}

catch {

i = 2;

return i;

}

finally {

i = 3;

Debug.Print(i.ToString());

}

}

If you execute above code, just after setting value of i to 2 after executing return statement, finally block should called. And in finally block setting value of i to 3, even Debug.print should print 3 but when you get return value (return i), you will get 2 as an output.


Guess, what had happend.


Let me change existing code, let me pass a reference parameter and we will see what happen to reference variable.

private int MyFunction(ref int passInt)

{

int i = 0;

try{

i = 1;

passInt = 1;

throw new Exception("Int Error");

}

catch{

i = 2;

passInt = 2;

return i;

}

finally{

i = 3;

passInt = 3;

Debug.Print(i.ToString());

}

}


In above code, passed parameter value becomes 3 but return value becomes 2.

Now try to understand what had happened in previous code.


“Finally statement is always calling before returning from function”, that means return statement executed but value yet not returned. Then in that case ever after changing value of i in finally block, why return value persisting previous value.

Now go to few years back, and think that every function is called by a pointer and address of that function is set to callee. Now you will understand what had happened.

Below is a code generated by complier before creating IL code.

private int MyFunction()

{

int CS$1$0000;

int i = 0;

try{

i = 1;

throw new Exception("Int Error");

}

catch{

i = 2;

CS$1$0000 = i;

}

finally{

Debug.Print(i.ToString());

}

return CS$1$0000;

}

See that return statement has been reset and return statement placed at then end of function. Also one variable CS$1$0000 added to set return value.

You will see that in Catch block, value of i is set to CS$1$0000 variable and in turns CS$1$0000 is getting return from method.

IL version of above code will give better explanation with stack trace; you will also get clear when you will execute IL code with IL compiler.

IL Version:

.method private hidebysig instance int32 MyFunction() cil managed
{
    .maxstack 2
    .locals init (
        [0] int32 i,
        [1] int32 CS$1$0000)
    L_0000: nop 
    L_0001: ldc.i4.0 
    L_0002: stloc.0 
    L_0003: nop 
    L_0004: ldc.i4.1 
    L_0005: stloc.0 
    L_0006: ldstr "Int Error"
    L_000b: newobj instance void [mscorlib]System.Exception::.ctor(string)
    L_0010: throw 
    L_0011: pop 
    L_0012: nop 
    L_0013: ldc.i4.2 
    L_0014: stloc.0 
    L_0015: ldloc.0 
    L_0016: stloc.1 
    L_0017: leave.s L_002b
    L_0019: nop 
    L_001a: ldc.i4.3 
    L_001b: stloc.0 
    L_001c: ldloca.s i
    L_001e: call instance string [mscorlib]System.Int32::ToString()
    L_0023: call void [System]System.Diagnostics.Debug::Print(string)
    L_0028: nop 
    L_0029: nop 
    L_002a: endfinally 
    L_002b: nop 
    L_002c: ldloc.1 
    L_002d: ret 
    .try L_0003 to L_0011 catch object handler L_0011 to L_0019
    .try L_0003 to L_0019 finally handler L_0019 to L_002b
}
 

Summary:

Compiler is setting return value before returning a method and changes into the method level variable is not affecting return value.


Note: I have used Red Gate’s Reflector to get IL code.


Please share me your understanding on this.